US-12585476-B2

Multi-dimensional auto scaling of container-based clusters

Published: March 24, 2026
Assignee: not available in USPTO data we have
Inventors: not available in USPTO data we have
Technical Abstract

The disclosure provides a method for determining a target configuration for a container-based cluster. The method generally includes determining, by a virtualization management platform configured to manage components of the cluster, a current state of the cluster, determining, by the virtualization management platform, at least one of performance metrics or resource utilization metrics for the cluster based on the current state of the cluster, processing, with a model configured to generate candidate configurations recommended for the cluster, the current state and at least one of the performance metrics or the resource utilization metrics and thereby generate the candidate configurations, calculating a reward score for each of the candidate configurations, selecting the target configuration as a candidate configuration from the candidate configurations based on the reward score of the target configuration, and adjusting configuration settings for the cluster based on the target configuration to alter the current state of the cluster.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for determining a target configuration for a container-based cluster, the method comprising:

2. The method of, wherein selecting the target configuration comprises selecting the target configuration from the plurality of candidate configurations having the highest reward score among the reward scores calculated for the plurality of candidate configurations.

3. The method of, wherein adjusting the configuration settings for the container-based cluster comprises automatically adjusting at least one of:

4. The method of, further comprising:

5. The method of, wherein:

6. The method of, wherein:

7. The method of, wherein:

8. The method of, further comprising generating each of the plurality of training data instances, wherein generating a training data instance of the plurality of training data instances comprises:

9. The method of, further comprising:

10. A system comprising:

11. The system of, wherein selecting the target configuration comprises selecting the target configuration having the highest reward score among the reward scores calculated for the plurality of candidate configurations.

12. The system of, wherein adjusting the configuration settings for the container-based cluster comprises automatically adjusting at least one of:

13. The system of, wherein the one or more processors and the at least one memory are further configured to:

14. The system of, wherein:

15. The system of, wherein:

16. The system of, wherein:

17. The system of, wherein the one or more processors and the at least one memory are further configured to generate each of the plurality of training data instances, wherein to generate a training data instance of the plurality of training data instances comprises to:

18. The system of, wherein the one or more processors and the at least one memory are further configured to:

19. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors of a computing system, cause the computing system to perform operations for determining a target configuration for a container-based cluster, the operations comprising:

20. The non-transitory computer-readable medium of, wherein selecting the target configuration comprises selecting the target configuration from the plurality of candidate configurations having the highest reward score among the reward scores calculated for the plurality of candidate configurations.

Detailed Description

Complete technical specification and implementation details from the patent document.

Modern applications are applications designed to take advantage of the benefits of modern computing platforms and infrastructure. For example, modern applications can be deployed in a multi-cloud or hybrid cloud fashion. A multi-cloud application may be deployed across multiple clouds, which may be multiple public clouds provided by different cloud providers or the same cloud provider or a mix of public and private clouds. The term, “private cloud” refers to one or more on-premises data centers that may have pooled resources allocated in a cloud-like manner. Hybrid cloud refers specifically to a combination of public cloud and private clouds. Thus, an application deployed across a hybrid cloud environment consumes both cloud services executing in a public cloud and local services executing in a private data center (e.g., a private cloud). Within the public cloud or private data center, modern applications can be deployed onto one or more virtual machines (VMs), containers, application services, and/or the like.

A container is a package that relies on virtual isolation to deploy and run applications that depend on a shared operating system (OS) kernel. Containerized applications (also referred to as “containerized workloads”) can include a collection of one or more related applications packaged into one or more containers. In some orchestration systems, a set of one or more related containers sharing storage and network resources, referred to as a pod, is deployed as a unit of computing software. Container orchestration systems automate the lifecycle of containers, including such operations as provisioning, deployment, monitoring, scaling (up and/or down), networking, and load balancing.

Kubernetes® (K8S®) software is an example open-source container orchestration platform that automates the deployment and operation of such containerized applications. At a high level, the Kubernetes platform is made up of a central database containing Kubernetes objects, or persistent entities, that are managed in the platform. Kubernetes objects are represented in configuration files, such as JavaScript Object Notation (JSON) or YAML files, and describe the intended state of a Kubernetes cluster of interconnected nodes used to run containerized applications. A node may be a physical machine, or a VM configured to run on a physical machine running a hypervisor. The intended state of the cluster includes intended infrastructure (e.g., pods, containers, etc.) and containerized applications that are to be deployed in the cluster. In other words, a Kubernetes object is a “record of intent”: once an object is created, the Kubernetes system will constantly work to ensure that object is realized in the deployment.

There are two categories of objects in Kubernetes: native Kubernetes objects and custom resource definition (CRD) objects (also referred to herein as “custom resources”). Native Kubernetes objects include pods, services, volumes, namespaces, deployments, replication controllers, ReplicaSets, and/or the like which are supported and can be created/manipulated by a Kubernetes application programming interface (API). The Kubernetes API is a resource-based (e.g., RESTful or representational state transfer architectural style) programmatic interface provided via HTTP. A CRD object, on the other hand, is an object that extends the Kubernetes API or allows a user to introduce their own API into a Kubernetes cluster.

Kubernetes is designed to accommodate any number and/or type of configurations, as long as certain limitations are not exceeded (e.g., no more than a first threshold number of pods per node, no more than a second threshold total number of pods, no more than a third threshold total number of nodes, etc.). Deriving the “best” configuration for deploying containerized applications, however, is a technically challenging problem. In particular, a “best” configuration is a configuration that balances cost, resource consumption, and application performance/availability, while keeping the platform operations simple. Understanding a number of pods, a number of nodes, node size and capacity, a number of ReplicaSets (e.g., number of pod replicas), pod resource requirements, pod resource limitations, etc. to implement in the container-based cluster such that these variables are optimized may not be obvious or easily determined. In fact, determining the “best” configuration generally requires a trial-and-error process resulting in (1) many terminated/restarted pods in cases of under-allocation and (2) wasted resources in cases of over-allocation. Further, such configurations may need to be constantly updated as application requests change over time.
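The threshold checks mentioned above can be sketched as a simple validation step. This is only an illustrative sketch: the function name `check_limits` is hypothetical, and the default thresholds are placeholder assumptions modeled on figures commonly cited in Kubernetes large-cluster guidance, not values taken from the disclosure.

```python
import math


def check_limits(total_pods, total_nodes,
                 max_pods_per_node=110,     # assumed placeholder threshold
                 max_total_pods=150_000,    # assumed placeholder threshold
                 max_nodes=5_000):          # assumed placeholder threshold
    """Return a list of limit violations for a proposed configuration."""
    if total_nodes < 1:
        return ["cluster needs at least one node"]
    violations = []
    if total_nodes > max_nodes:
        violations.append("too many nodes")
    if total_pods > max_total_pods:
        violations.append("too many pods")
    # Even a perfectly balanced placement puts this many pods on some node.
    if math.ceil(total_pods / total_nodes) > max_pods_per_node:
        violations.append("too many pods per node")
    return violations


print(check_limits(300, 2))
print(check_limits(200, 2))
```

A configuration that passes these checks is merely admissible; it says nothing about whether the configuration balances cost, resource consumption, and performance.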

Some conventional approaches to determining the “best” configuration for a container-based cluster have been more empirical in nature. For example, a conventional approach is to rely on a human expert's knowledge in identifying the cluster configuration for deployment. Experts may need to have knowledge of workload patterns, application implementations, as well as an understanding of the infrastructure requirements and limitations when deriving the configuration. As this is an inherently subjective method, it is not repeatable or scalable. Accordingly, conventional manual methods for determining cluster configurations that seek to optimize cost, resource consumption, application performance, and operation complexity may not be effective.

Further, to account for changes in the number of applications deployed and/or resource requests by different applications over time, conventional approaches have developed several tools, including a horizontal pod autoscaler (HPA), a vertical pod autoscaler (VPA), and a cluster autoscaler (CA). These autoscalers are designed to help guarantee availability in Kubernetes by providing automatic scalability of application resources to adapt to varying load.

The HPA is a tool designed to automatically update workload resources, such as Deployments and ReplicaSets (e.g., designed to manage the deployment and scaling of a set of pods), scaling them to match the demand for applications in a container-based cluster. Horizontal scaling refers to the process of deploying additional pods in the cluster in response to increased load and/or removing pods in the container-based cluster in response to decreased load. In some cases, the HPA is designed to automatically increase and/or decrease the number of pod replicas in the cluster based on actual usage metrics, such as central processing unit (CPU) and/or memory utilization. In certain embodiments, the HPA is implemented as a control loop to scale pod replicas based on the ratio between desired metric values and current metric values. The choice of desired usage metric values imposes a tradeoff between application availability and operation costs.
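The ratio-based control loop described above can be illustrated with a short sketch. It follows the rule published in the Kubernetes HPA documentation (desired replicas equal the current replicas multiplied by the ratio of current to desired metric value, rounded up). The function name and the `tolerance` default are assumptions made for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     tolerance=0.1):  # tolerance value is an assumption
    """Suggest a replica count from the ratio of current to target metric."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Leave the replica count alone when the metric is close to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return max(1, math.ceil(current_replicas * ratio))


# Four replicas at 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90.0, 60.0))
# Four replicas at 30% CPU against a 60% target: scale in.
print(desired_replicas(4, 30.0, 60.0))
```

A lower target value keeps more headroom per pod (better availability, higher cost), which is the tradeoff noted above.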

The VPA is a tool designed to automatically adjust resource limits and/or resource requests (e.g., with respect to CPU and/or memory) to help ensure that pods are operating efficiently at all times. The VPA determines the adjustment by analyzing historic memory and/or CPU usage, as well as current memory and/or CPU usage, by containers running in pods. In certain embodiments, the VPA provides recommended values for resource requests and/or limits that a user can use to manually update the configuration. In certain embodiments, the VPA automatically updates the configuration based on these recommended values.

The CA is a standalone program that scales up or down a number of nodes in a container-based cluster to help ensure that there are enough nodes to run all requested pods, and to remove excess nodes from the cluster. For example, the CA is designed to add nodes to the cluster when there are pending pods that cannot be scheduled on any of the existing nodes due to insufficient resources among the existing nodes. Additionally, the CA is designed to reduce a number of nodes in the cluster when at least one node is consistently not needed, and when pods running thereon are capable of being transferred to a different node for execution. As such, the CA is designed to act in a reactive manner by adding and/or removing nodes based on new pods added and/or removed. In some cases, this causes new pods added to the cluster to be in a pending state until an additional node is provisioned, thereby affecting application performance. In some CA implementations, paused pods are deployed to ameliorate this issue by reserving space to trick the CA into adding extra nodes prematurely. This, however, introduces additional operational complexity and cost into the system.
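The reactive scale-up behavior can be sketched as follows. This is a deliberately simplified first-fit placement over CPU requests only, not the actual cluster autoscaler algorithm; the function name and the use of millicores as the unit are assumptions for illustration.

```python
def nodes_to_add(pending_cpu_requests, free_cpu_per_node, new_node_cpu):
    """Count new nodes needed so every pending pod can be placed.

    All quantities are CPU millicores. Pods are placed first-fit,
    largest request first, onto existing free capacity before a new
    node is provisioned.
    """
    free = list(free_cpu_per_node)
    added = 0
    for request in sorted(pending_cpu_requests, reverse=True):
        for i, capacity in enumerate(free):
            if request <= capacity:
                free[i] = capacity - request
                break
        else:
            # No existing node fits: the pod stays pending until a node is added.
            if request > new_node_cpu:
                raise ValueError("pod request exceeds the capacity of a new node")
            free.append(new_node_cpu - request)
            added += 1
    return added


print(nodes_to_add([2000, 1000, 1000], [1500, 500], 4000))
```

Because nodes are only added after pods are already pending, the delay between the request and the new node becoming ready is what affects application performance.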

In other words, the above-described autoscalers automate the process of scaling applications to adapt to workload changes. However, the autoscalers themselves need to be configured and tuned over time, thereby introducing non-trivial setup and maintenance overhead. Moreover, these autoscaler tools may not work well together (if at all), and, as such, may add to the complexity of setup, tuning, and maintenance of these autoscalers when deployed. For example, currently, the HPA and VPA are not designed to be used simultaneously on the same resource, given they both attempt to optimize the resource in different ways.

It should be noted that the information included in the Background section herein is simply meant to provide a reference for the discussion of certain embodiments in the Detailed Description. None of the information included in this Background should be considered as an admission of prior art.

One or more embodiments provide a method for determining a target configuration for a container-based cluster. The method generally includes determining, by a virtualization management platform configured to manage components of the container-based cluster, a current state of the container-based cluster. The method generally includes determining, by the virtualization management platform, at least one of performance metrics or resource utilization metrics for the container-based cluster based on the current state of the container-based cluster. The method generally includes processing, with a model configured to generate a plurality of candidate configurations recommended for the container-based cluster, the current state and at least one of the performance metrics or the resource utilization metrics and thereby generate the plurality of candidate configurations. The method generally includes calculating a reward score for each of the plurality of candidate configurations. The method generally includes selecting the target configuration as a candidate configuration from the plurality of candidate configurations based on the reward score of the target configuration. The method generally includes adjusting configuration settings for the container-based cluster based on the target configuration to alter the current state of the container-based cluster.

Further embodiments include a non-transitory computer-readable storage medium comprising instructions that cause a computer system to carry out the above methods, as well as a computer system configured to carry out the above methods.

To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one embodiment may be beneficially utilized on other embodiments without specific recitation.

Techniques for performing multi-dimensional auto scaling in container-based clusters are described herein. Multi-dimensional auto-scaling involves dynamically scaling pods (e.g., a number of pods and/or resources allocated to the pods) and/or nodes (e.g., number of nodes) up and/or down based on workload changes in a container-based cluster.

In particular, embodiments described herein utilize machine learning model(s) to generate and evaluate different candidate configurations for a container-based cluster for purposes of determining a target configuration for the cluster that provides improved workload performance and availability, while also limiting cost, resource consumption, and overall complexity over other candidate configurations. In certain embodiments, the identified target configuration is recommended to a user to prompt the user to manually update the configuration for the cluster. In certain other embodiments, the identified target configuration is used to automatically adjust configuration settings for the cluster. Adjusting configuration settings for the cluster using the target configuration includes, at least, scaling a number of pods, an amount of resources allocated to the pods, and/or a number of nodes in the cluster up and/or down.

The candidate configurations described herein contain values for various configuration settings, including at least, values specifying a number of pods, resource requests and/or limits assigned to each of the pods, and a number of nodes to deploy in the cluster. Each of the candidate configurations is generated by the machine learning model(s) based on a current state, resource utilization metrics, and/or performance metrics collected for the cluster. The machine learning model(s) are then configured to predict future application performance and resource utilization for each of the candidate configurations. A target configuration is selected among the candidate configurations based on the model predictions and, in some cases, an optimization objective identified by a user. The optimization objective may specify what factors are to be given priority over others when selecting the target configuration. For example, an optimization objective may specify to select a candidate configuration that prioritizes application performance and availability over cost optimization.
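One way to picture how an optimization objective could steer the selection is a weighted reward score. The sketch below is an assumption-laden illustration, not the scoring used in the disclosure: the `Candidate` fields, the three reward terms, the weights, and the normalization constants are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    pods: int
    nodes: int
    predicted_latency_ms: float   # hypothetical model prediction
    predicted_utilization: float  # hypothetical model prediction, 0.0 to 1.0


def reward(c, weights, latency_budget_ms=200.0, max_nodes=10):
    """Weighted sum of performance, cost, and efficiency terms (all 0..1)."""
    performance = max(0.0, 1.0 - c.predicted_latency_ms / latency_budget_ms)
    cost = 1.0 - min(1.0, c.nodes / max_nodes)  # fewer nodes scores higher
    efficiency = c.predicted_utilization
    return (weights["performance"] * performance
            + weights["cost"] * cost
            + weights["efficiency"] * efficiency)


def select_target(candidates, weights):
    """Pick the candidate with the highest reward score."""
    return max(candidates, key=lambda c: reward(c, weights))


small = Candidate(pods=4, nodes=2, predicted_latency_ms=150.0, predicted_utilization=0.8)
large = Candidate(pods=8, nodes=5, predicted_latency_ms=50.0, predicted_utilization=0.4)

performance_first = {"performance": 0.8, "cost": 0.1, "efficiency": 0.1}
cost_first = {"performance": 0.1, "cost": 0.8, "efficiency": 0.1}

print(select_target([small, large], performance_first))
print(select_target([small, large], cost_first))
```

With these illustrative numbers, a performance-first objective favors the larger configuration, while a cost-first objective favors the smaller one.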

Further, certain embodiments described herein provide a method for training machine learning model(s) to generate and evaluate the different candidate configurations such that a target configuration for the cluster can be determined. The method includes obtaining a plurality of training data instances, and using the training data instances to train the model(s). The training data instances are obtained by collecting performance metrics and resource utilization metrics for different configurations and simulated loads applied to the cluster. As such, each training data instance used to train the machine learning model(s) includes (1) a training input comprising configuration information and information about a simulated load applied to the cluster and (2) a training output comprising performance metrics and resource utilization metrics collected for the corresponding configuration and load.
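The structure of a training data instance can be sketched as a pair of records. The field names and the `measure` callback below are hypothetical; in practice the measurements would come from running each configuration under each simulated load and collecting the resulting metrics.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class ClusterConfig:
    pods: int
    nodes: int
    cpu_request_millicores: int


@dataclass(frozen=True)
class TrainingInstance:
    # Training input: configuration plus the simulated load applied to it.
    config: ClusterConfig
    load_rps: float
    # Training output: performance and utilization metrics observed.
    metrics: Dict[str, float]


def build_instances(configs: Sequence[ClusterConfig],
                    loads_rps: Sequence[float],
                    measure: Callable[[ClusterConfig, float], Dict[str, float]]
                    ) -> List[TrainingInstance]:
    """Create one instance per (configuration, simulated load) pair."""
    return [TrainingInstance(c, load, measure(c, load))
            for c, load in product(configs, loads_rps)]


def fake_measure(config, load_rps):
    """Stand-in for real metric collection; values are made up."""
    capacity = config.pods * config.cpu_request_millicores
    return {"cpu_utilization": min(1.0, load_rps * 10 / capacity)}


configs = [ClusterConfig(2, 1, 500), ClusterConfig(4, 2, 500)]
instances = build_instances(configs, [10.0, 50.0, 100.0], fake_measure)
print(len(instances))
```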

The techniques described herein for performing multi-dimensional auto scaling in container-based clusters using machine learning model(s) provide significant technical advantages over conventional solutions, such as improved configuration optimization with respect to multiple facets of the cluster. Further, the use of machine learning model(s) in determining a cluster's target configuration helps improve the efficiency of determining a configuration (e.g., “best” configuration) for a cluster, as well as application performance when the target configuration is deployed for the cluster. For example, the target configuration applied to the cluster may define resource requests and/or limits for an application running in the cluster such that the application is neither under-provisioned nor over-provisioned. Under-provisioning often results in poor application performance due to a lack of available resources, while over-provisioning negatively impacts cluster capacity and overall cost.

A block diagram illustrates a computing system in which embodiments described herein may be implemented. The computing system includes one or more hosts, a management network, a data network, and a virtualization management platform.

Host(s) may be geographically co-located servers on the same rack or on different racks in any arbitrary location in the data center. Host(s) may be in a single host cluster or logically divided into a plurality of host clusters. Each host may be configured to provide a virtualization layer, also referred to as a hypervisor, that abstracts processor, memory, storage, and networking resources of a hardware platform of each host into multiple VMs (collectively referred to as VMs and individually referred to as a VM) that run concurrently on the same host.

Host(s) may be constructed on a server grade hardware platform, such as an x86 architecture platform. The hardware platform of each host includes components of a computing device such as one or more processors (central processing units (CPUs)), memory (random access memory (RAM)), one or more network interfaces (e.g., physical network interfaces (PNICs)), storage, and other components (not shown). The CPU is configured to execute instructions that may be stored in memory, and optionally in storage. The network interface(s) enable hosts to communicate with other devices via a physical network, such as the management network and the data network.

In certain embodiments, the hypervisor runs in conjunction with an operating system (OS) (not shown) in the host. In some embodiments, the hypervisor can be installed as system level software directly on the hardware platform of the host (often referred to as “bare metal” installation) and be conceptually interposed between the physical hardware and the guest OSs executing in the VMs. It is noted that the term “operating system,” as used herein, may refer to a hypervisor. One example of a hypervisor that may be configured and used in embodiments described herein is a VMware ESXi™ hypervisor provided as part of the VMware vSphere® solution made commercially available by VMware, Inc. of Palo Alto, CA.

Each of the VMs implements a virtual hardware platform that supports the installation of a guest OS which is capable of executing one or more applications. The guest OS may be a standard, commodity operating system. Examples of a guest OS include Microsoft Windows, Linux, and/or the like. Applications may be any software program, such as a word processing program.

In certain embodiments, the computing system includes a container orchestrator. The container orchestrator implements a container orchestration control plane (also referred to herein as the “control plane”), such as a Kubernetes control plane, to deploy and manage applications and/or services thereof on hosts using containers. In particular, each VM includes a container engine installed therein and running as a guest application under control of the guest OS. The container engine is a process that enables the deployment and management of virtual instances, referred to herein as “containers,” in conjunction with OS-level virtualization on the guest OS within the VM and the container orchestrator. Containers provide isolation for user-space processes executing within them. Containers encapsulate an application (and its associated applications) as a single executable package of software that bundles application code together with all of the related configuration files, libraries, and dependencies required for it to run.

The control plane runs on a cluster of hosts and may deploy containerized applications as containers on the cluster of hosts. The control plane manages the computation, storage, and memory resources to run containers in the host cluster. In certain embodiments, the hypervisor is integrated with the control plane to provide a “supervisor cluster” (i.e., management cluster) that uses VMs to implement both control plane nodes and compute objects managed by the Kubernetes control plane.

In certain embodiments, the control plane deploys and manages applications as pods of containers running on hosts, either within VMs or directly on an OS of hosts. A pod is a group of one or more containers and a specification for how to run the containers. A pod may be the smallest deployable unit of computing that can be created and managed by the control plane.

An example container-based cluster for running containerized applications is illustrated in the drawings. While the illustrated example is a Kubernetes cluster, in other examples, the container-based cluster may be another type of container-based cluster based on container technology, such as Docker Swarm clusters. As illustrated, the Kubernetes cluster is formed from a cluster of interconnected nodes, including (1) one or more worker nodes that run one or more pods having containers and (2) one or more control plane nodes having control plane components running thereon that control the cluster (e.g., where a node is a physical machine, such as a host, or a VM configured to run on a host).

Each worker node includes a kubelet. The kubelet is an agent that helps to ensure that one or more pods run on each worker node according to a defined state for the pods, such as defined in a configuration file. Each pod may include one or more containers. The worker nodes can be used to execute various applications and software processes using containers. Further, each worker node may include a kube proxy (not illustrated). A kube proxy is a network proxy used to maintain network rules. These network rules allow for network communication with pods from network sessions inside and/or outside of the Kubernetes cluster.

The control plane (e.g., running on one or more control plane nodes) includes components such as an application programming interface (API) server, controller(s), a cluster store (etcd), and scheduler(s). The control plane's components make global decisions about the Kubernetes cluster (e.g., scheduling), as well as detect and respond to cluster events.

The API server operates as a gateway to the Kubernetes cluster. As such, a command line interface, web user interface, users, and/or services communicate with the Kubernetes cluster through the API server. One example of a Kubernetes API server is kube-apiserver. The kube-apiserver is designed to scale horizontally; that is, this component scales by deploying more instances. Several instances of kube-apiserver may be run, and traffic may be balanced between those instances.

Controller(s) are responsible for running and managing controller processes in the Kubernetes cluster. As described above, the control plane may have (e.g., four) control loops called controller processes, which watch the state of the Kubernetes cluster and try to modify the current state of the Kubernetes cluster to match an intended state of the Kubernetes cluster.
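The idea of a control loop that moves the current state toward the intended state can be sketched as a single reconciliation pass. This is a toy model under stated assumptions: state is reduced to a mapping from workload name to replica count, and the `reconcile` function and its action tuples are hypothetical rather than part of any Kubernetes API.

```python
def reconcile(intended, current):
    """Return the actions needed to make `current` match `intended`.

    Both arguments map a workload name to its replica count.
    """
    actions = []
    for name, want in sorted(intended.items()):
        have = current.get(name, 0)
        if have < want:
            actions.append(("create", name, want - have))
        elif have > want:
            actions.append(("delete", name, have - want))
    # Anything running that is no longer declared gets removed.
    for name in sorted(set(current) - set(intended)):
        if current[name] > 0:
            actions.append(("delete", name, current[name]))
    return actions


intended = {"web": 3, "api": 2}
current = {"web": 1, "api": 2, "old": 1}
print(reconcile(intended, current))
```

A real controller repeats this comparison continuously, so a later change to either the declared or observed state produces a fresh set of actions.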

Scheduler(s) are configured to allocate new pods to worker nodes.

The cluster store (etcd) is a data store, such as a consistent and highly-available key value store, used as a backing store for Kubernetes cluster data. In certain embodiments, the cluster store (etcd) stores configuration file(s), such as JavaScript Object Notation (JSON) or YAML files, made up of one or more manifests that declare intended system infrastructure and workloads to be deployed in the Kubernetes cluster. Kubernetes objects, or persistent entities, can be created, updated and deleted based on the configuration file(s) to represent the state of the Kubernetes cluster.

A Kubernetes object is a “record of intent”: once an object is created, the Kubernetes system will constantly work to ensure that object is realized in the deployment. One type of Kubernetes object is a custom resource definition (CRD) object (also referred to herein as a “custom resource (CR)”) that extends the API server or allows a user to introduce their own API into the Kubernetes cluster. In particular, Kubernetes provides a standard extension mechanism, referred to as custom resource definitions, that enables extension of the set of resources and objects that can be managed in a Kubernetes cluster.

As described above, Kubernetes is designed to accommodate any number and/or type of configurations (e.g., identified in configuration file(s)), as long as certain limitations are not exceeded. To determine a target configuration for a cluster that provides improved workload performance and availability, while also limiting cost, resource consumption, and/or overall complexity, embodiments described herein utilize machine learning model(s). The one or more machine learning models are trained to (1) generate a plurality of candidate configurations for the cluster, (2) predict future application performance and resource utilization for each of the candidate configurations, and (3) select a target configuration among the candidate configurations based on the model predictions and, in some cases, an optimization objective identified by a user. Using machine learning model(s) to determine a target configuration for a cluster is described below.

In particular, an example system is configured to perform multi-dimensional auto-scaling for a container-based cluster, according to an example embodiment of the present disclosure. As illustrated, the system includes a Kubernetes cluster (e.g., as described in detail above), a virtualization management platform, and configuration prediction model(s). In certain embodiments, the Kubernetes cluster includes an automated configuration update process controller. The system is configured to generate and evaluate candidate configurations to determine a target configuration for the Kubernetes cluster. In this example, the determined target configuration is provided to the Kubernetes cluster to define a new intended state for the Kubernetes cluster. The new intended state defined for the Kubernetes cluster may help to improve performance and availability of containerized applications running in the cluster.

An example method for determining a target configuration for a container-based cluster, according to an example embodiment of the present disclosure, may be performed by the system to determine the target configuration for the Kubernetes cluster. As such, the system and the method are described in conjunction below.

The method begins with a user interacting with a control plane of a container-based cluster to provide a configuration for the container-based cluster. The configuration defines an intended state for the cluster. For example, a user may interact with a control plane node to provide configuration file(s). Configuration files, provided by the user, are made up of one or more manifests that declare intended system infrastructure and workloads to be deployed in the Kubernetes cluster. The configuration file(s) include initial configuration settings for the cluster, specifying a number of pods, a number of nodes, node size and capacity, a number of ReplicaSets (e.g., number of pod replicas), pod resource requirements, pod resource limitations, etc. to implement in the Kubernetes cluster. As described in detail below, the initial configuration defined by the user in the configuration file(s) may be adjusted based on a target configuration determined for the cluster by the configuration prediction model(s).
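As a concrete picture of such a configuration file, the sketch below builds a Deployment manifest as a Python dictionary and then adjusts it toward a target configuration. The manifest fields are standard Kubernetes Deployment fields; the workload name, image, resource values, and the `apply_target` helper are hypothetical and chosen only for illustration.

```python
import copy
import json

# Initial configuration a user might declare (values are illustrative).
initial_manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {"containers": [{
                "name": "web",
                "image": "example.com/web:1.0",  # hypothetical image
                "resources": {
                    "requests": {"cpu": "250m", "memory": "128Mi"},
                    "limits": {"cpu": "500m", "memory": "256Mi"},
                },
            }]},
        },
    },
}


def apply_target(manifest, replicas, cpu_request, cpu_limit):
    """Return a copy of the manifest updated with target settings."""
    updated = copy.deepcopy(manifest)
    updated["spec"]["replicas"] = replicas
    resources = updated["spec"]["template"]["spec"]["containers"][0]["resources"]
    resources["requests"]["cpu"] = cpu_request
    resources["limits"]["cpu"] = cpu_limit
    return updated


adjusted = apply_target(initial_manifest, replicas=5,
                        cpu_request="300m", cpu_limit="600m")
print(json.dumps(adjusted, indent=2))
```

Working on a copy keeps the user's original declaration intact, so the initial and adjusted configurations can be compared or rolled back.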

The method proceeds with the control plane modifying a current state of the container-based cluster to match the intended state of the cluster. As such, controller(s) in the control plane node work to modify the current state of the Kubernetes cluster to match the intended state of the Kubernetes cluster specified in the configuration file(s). In certain embodiments, modifying the current state of the Kubernetes cluster to match the intended state includes deploying application(s) in container(s) on worker node(s).

The method proceeds with a virtualization management platform (e.g., configured to manage components of the container-based cluster) determining performance metrics for the containerized application(s) deployed and running in the container-based cluster. Further, at subsequent operations, the virtualization management platform additionally collects resource utilization metrics for the container-based cluster and information about the current state of the container-based cluster, respectively.

For example, at these operations, the virtualization management platform collects performance metrics, resource utilization metrics, and information about a current state of the Kubernetes cluster.

Performance metrics may include performance information collected for each application running in the Kubernetes cluster. The performance information may provide insight into CPU usage of each application (e.g., compared to a CPU limit assigned to the corresponding application or a pod running the corresponding application) and/or memory usage of each application (e.g., compared to a memory limit assigned to the corresponding application or a pod running the corresponding application). The performance information may also include details about application availability (e.g., service uptime), latency, average response time (e.g., read and/or write), input/output (I/O) rate, user satisfaction, request error rates, and/or the like for each application running in the Kubernetes cluster.

Resource utilization metrics may include information about total CPU utilization and/or total memory utilization. Total CPU utilization is the average utilization percentage of the CPU allocated to a particular component (e.g., pod, worker node, etc.) in the Kubernetes cluster over an interval. Total memory utilization is the average utilization percentage of the memory allocated to a particular component in the Kubernetes cluster over an interval. Resource utilization metrics may provide insight into whether resources were adequate, under-provisioned, and/or over-provisioned for different components in the Kubernetes cluster.
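The definition of total utilization as an average percentage of the allocated resource over an interval can be worked through in a few lines. This is a sketch under assumptions: the function names and the 30% and 80% thresholds used to label a component are hypothetical and do not come from the disclosure.

```python
from statistics import mean


def average_utilization_pct(usage_samples, allocated):
    """Average utilization, as a percentage of the allocation, over an interval.

    `usage_samples` holds usage readings taken during the interval, in the same
    unit as `allocated` (e.g. CPU cores or GiB of memory).
    """
    if allocated <= 0:
        raise ValueError("allocated must be positive")
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    return 100.0 * mean(usage_samples) / allocated


def classify(util_pct, low=30.0, high=80.0):
    """Label a component using assumed thresholds (not from the source)."""
    if util_pct < low:
        return "over-provisioned"
    if util_pct > high:
        return "under-provisioned"
    return "adequate"


# A pod allocated 1.0 core that used about 0.9 cores on average.
cpu_pct = average_utilization_pct([0.9, 0.95, 0.85], allocated=1.0)
label = classify(cpu_pct)
```

The same two functions apply unchanged to memory, since only the unit of the samples and the allocation differs.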

As described above, the current state of the Kubernetes cluster may match the intended state declared for the cluster in the configuration files. The current state determined by the virtualization management platform may include information about a number of pods deployed, a number of nodes deployed, resources allocated to different components, and/or other information about infrastructure and/or applications deployed in the Kubernetes cluster.

The method proceeds with processing the performance metrics, the resource utilization metrics, and the current state of the container-based cluster via model(s) configured to generate a plurality of candidate configurations recommended for the container-based cluster (e.g., configuration prediction model(s)). By processing these inputs, the model(s) may generate the plurality of candidate configurations for the cluster.
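To make the flow from inputs to candidate configurations concrete, the sketch below enumerates candidates around the current state and then picks the one with the highest reward score, as the abstract and claims describe. It is only an illustration: a hand-written enumerator stands in for the configuration prediction model(s), and the reward function, the 60% utilization target, and all names are assumptions.

```python
from itertools import product


def generate_candidates(current, replica_steps=(-1, 0, 1),
                        cpu_limit_factors=(0.75, 1.0, 1.25)):
    """Enumerate nearby configurations (a stand-in for a trained model)."""
    candidates = []
    for step, factor in product(replica_steps, cpu_limit_factors):
        replicas = current["replicas"] + step
        if replicas < 1:
            continue  # a cluster needs at least one replica
        candidates.append({
            "replicas": replicas,
            "pod_cpu_limit": round(current["pod_cpu_limit"] * factor, 3),
        })
    return candidates


def make_reward(cpu_demand_cores, target_util=0.6, cost_weight=0.01):
    """Build a hypothetical reward: stay near target utilization, stay cheap."""
    def reward(config):
        capacity = config["replicas"] * config["pod_cpu_limit"]
        utilization = cpu_demand_cores / capacity
        return -abs(utilization - target_util) - cost_weight * capacity
    return reward


def select_target(candidates, reward):
    """Pick the candidate with the highest reward score."""
    return max(candidates, key=reward)


current = {"replicas": 4, "pod_cpu_limit": 1.0}
candidates = generate_candidates(current)
target = select_target(candidates, make_reward(cpu_demand_cores=3.6))
```

With an observed demand of 3.6 cores against 4.0 cores of capacity, the sketch moves toward more replicas with a larger CPU limit, since that brings predicted utilization closest to the assumed target.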

In certain embodiments, the model(s) is a digital twin model. The digital twin model establishes a virtual model of the container-based cluster based on the real-time current state, resource utilization metrics, and performance metrics collected for the cluster being modeled. Once the virtual model is informed with such data, the virtual model is used to run simulations, study performance issues, and/or generate possible improvements with respect to configuration settings (e.g., generate candidate configurations) for the cluster. In other words, the virtual representation created for the cluster is used to simulate behaviors of the cluster and predict application and cluster performance.
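A toy version of the simulate-and-predict step may help fix ideas. The sketch below is not the disclosed digital twin: the class `ToyClusterTwin`, its fields, and the simple queueing-style latency formula are assumptions chosen only to show how a virtual model informed by observed data can predict behavior under a different configuration.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyClusterTwin:
    """Hypothetical virtual model built from observed cluster metrics."""
    cpu_demand_cores: float  # observed aggregate CPU demand of the workload
    base_latency_ms: float   # observed latency when load is negligible

    def simulate(self, replicas: int, pod_cpu_limit: float) -> dict:
        """Predict utilization and latency for a candidate configuration."""
        capacity = replicas * pod_cpu_limit
        utilization = self.cpu_demand_cores / capacity
        if utilization >= 1.0:
            latency_ms = math.inf  # demand exceeds capacity: saturated
        else:
            # Assumed model: latency grows as utilization approaches 100%.
            latency_ms = self.base_latency_ms / (1.0 - utilization)
        return {"utilization": utilization, "latency_ms": latency_ms}


twin = ToyClusterTwin(cpu_demand_cores=3.0, base_latency_ms=20.0)
as_is = twin.simulate(replicas=4, pod_cpu_limit=1.0)
scaled_out = twin.simulate(replicas=6, pod_cpu_limit=1.0)
```

Comparing `as_is` with `scaled_out` shows the kind of what-if question a virtual model answers without touching the running cluster.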

In certain embodiments, the digital twin model designed to generate candidate configurations recommended for the container-based cluster is a neural-network-based digital twin model. Neural networks generally include a plurality of connected units or nodes called artificial neurons. Each node generally has one or more inputs with associated weights, a net input function, and an activation function. Nodes are generally included in a plurality of connected layers, where nodes of one layer are connected to nodes of another layer, with various parameters governing the relationships between nodes and layers and the operation of the neural network. The layers may include an input layer, one or more hidden layers, and an output layer.
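The description of a node (weighted inputs, a net input function, and an activation function) and of layers feeding one another can be sketched as a small forward pass. The weights, layer sizes, and choice of ReLU and sigmoid activations below are arbitrary assumptions for illustration; the disclosure does not specify them.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias, activation):
    """One node: weighted sum (net input function), then activation."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net_input)


def layer(inputs, weight_rows, biases, activation):
    """One layer: every node sees all outputs of the previous layer."""
    return [neuron(inputs, w, b, activation)
            for w, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed the input layer's values through hidden and output layers."""
    out = inputs
    for weight_rows, biases, activation in layers:
        out = layer(out, weight_rows, biases, activation)
    return out


# Two inputs -> hidden layer of two ReLU nodes -> one sigmoid output node.
network = [
    ([[1.0, 0.0], [0.0, -1.0]], [0.0, 0.0], relu),
    ([[0.0, 0.0]], [0.0], sigmoid),
]
output = forward([1.0, 2.0], network)
```

Training would adjust the weights and biases; this sketch covers only inference with fixed parameters.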

A model training component (not illustrated) is generally configured to train the model(s) to generate recommended candidate configurations for different clusters. In certain embodiments, the model training component trains the model(s) using training data comprising a plurality of training data instances generated for the cluster being analyzed via use of a load generator (not illustrated), as described in detail below.



