US-20260023608-A1

Estimation of Microservice System Capacity

Published: January 22, 2026

Assignee: not available in USPTO data we have

Inventors: Hui LI

Technical Abstract

Systems and methods include determination, for each of a plurality of different types of requests, of a sequence in which services are executed in response thereto, generation of a directed graph based on the sequences, where each vertex of the directed graph represents a service, generation of a flow network graph by splitting each vertex of the directed graph into two vertices with a directed edge between, the directed edge associated with a capacity of the service represented by the vertex, determination of a maximum flow through the service system based on the flow network graph, determination of a residual capacity of each service based on the maximum flow and the capacity of each service, determination of services associated with a zero residual capacity, and increasing of computing resources available to the determined services.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A system comprising: a memory storing executable program code; and one or more processing units to execute the executable program code to cause the system to: determine, for each of a plurality of different types of requests, a sequence in which services of a service system are executed in response, where a first one of the determined sequences executed in response to a first type of request is different from a second one of the determined sequences executed in response to a second type of request; generate a directed graph based on the sequences of services, where each vertex of the directed graph represents a service of the services; generate a flow network graph by splitting each vertex of the directed graph into two vertices with a directed edge between, the directed edge associated with a weight representing a capacity of the service represented by the vertex; determine a maximum flow through the service system based on the flow network graph; determine a residual capacity of each service of the service system based on the maximum flow and the capacity of each service; determine one or more services associated with a non-zero residual capacity based on the determined residual capacities; and decrease computing resources available to the one or more services associated with a non-zero residual capacity.

2. The system of claim 1, the one or more processing units to execute the executable program code to cause the execution environment to: determine a second one or more services associated with a zero residual capacity based on the determined residual capacities; and increase computing resources available to the second one or more services associated with a zero residual capacity.

3. The system of claim 2, wherein determination of a maximum flow through the service system comprises: identification of a first path of services from a source of the flow network graph to a sink of the flow network graph; determination of a first minimum capacity of the services of the first path; association of a flow with the services of the first path based on the first minimum capacity; for each service of the first path, subtraction of the first minimum capacity from the capacity of the service and assignment of the difference as the residual capacity of the service; identification of a second path of services from the source of the flow network graph to the sink of the flow network graph; determination of a second minimum capacity of the services of the second path; association of a second flow with the services of the second path based on the second minimum capacity; and for each service of the second path, subtraction of the second minimum capacity from the capacity of the service associated and assignment of the difference as the residual capacity of the service.

4. The system of claim 3, wherein determination of a maximum flow through the service system comprises: identification of a third path of services from the source of the flow network graph to the sink of the flow network graph, wherein a service of the third path is also a service of the first path; determination of a third minimum capacity of the services of the third path; for the service of the third path which is also a service of the first path, determination that a flow associated with the service of the third path is less than the third minimum capacity; and in response to the determination that the flow associated with the service of the third path is less than the third minimum capacity: association of the third minimum capacity with the service of the third path; and subtraction of a difference between the third minimum capacity and the first minimum capacity from the assigned residual capacity of the service of the third path.

5. The system of claim 1, wherein determination of a maximum flow through the service system comprises: identification of a first path of services from a source of the flow network graph to a sink of the flow network graph; determination of a first minimum capacity of the services of the first path; association of a flow with the services of the first path based on the first minimum capacity; for each service of the first path, subtraction of the first minimum capacity from the capacity of the service and assignment of the difference as the residual capacity of the service; identification of a second path of services from the source of the flow network graph to the sink of the flow network graph; determination of a second minimum capacity of the services of the second path; association of a second flow with the services of the second path based on the second minimum capacity; and for each service of the second path, subtraction of the second minimum capacity from the capacity of the service associated and assignment of the difference as the residual capacity of the service.

6. The system of claim 5, wherein determination of a maximum flow through the service system comprises: identification of a third path of services from the source of the flow network graph to the sink of the flow network graph, wherein a service of the third path is also a service of the first path; determination of a third minimum capacity of the services of the third path; for the service of the third path which is also a service of the first path, determination that a flow associated with the service of the third path is less than the third minimum capacity; and in response to the determination that the flow associated with the service of the third path is less than the third minimum capacity: association of the third minimum capacity with the service of the third path; and subtraction of a difference between the third minimum capacity and the first minimum capacity from the assigned residual capacity of the service of the third path.

7. The system according to claim 1, wherein the capacities of two or more of the services are different.

8. A method comprising: determining, for each of a plurality of different types of requests, a sequence in which services of a service system are executed in response to the request; generating a directed graph based on the sequences of services, where each vertex of the directed graph represents a service of the services; generating a flow network graph by splitting each vertex of the directed graph into two vertices with a directed edge between, and associating the directed edge with a capacity of the service represented by the vertex; determining a maximum flow through the service system based on the flow network graph; determining a residual capacity of each service of the service system based on the maximum flow into each service and the capacity of each service; determining a one or more services associated with a zero residual capacity based on the determined residual capacities; and increase computing resources available to the one or more services associated with a zero residual capacity.

9. The method of claim 8, further comprising: determining a second one or more services associated with a non-zero residual capacity based on the determined residual capacities; and decreasing computing resources available to the second one or more services associated with a non-zero residual capacity.

10. The method of claim 9, wherein determining a maximum flow through the service system comprises: identifying a first path of services from a source of the flow network graph to a sink of the flow network graph; determining a first minimum capacity of the services of the first path; associating a flow with the services of the first path based on the first minimum capacity; for each service of the first path, subtracting the first minimum capacity from the capacity of the service and assignment of the difference as the residual capacity of the service; identifying a second path of services from the source of the flow network graph to the sink of the flow network graph; determining a second minimum capacity of the services of the second path; associating a second flow with the services of the second path based on the second minimum capacity; and for each service of the second path, subtracting the second minimum capacity from the capacity of the service associated and assignment of the difference as the residual capacity of the service.

11. The method of claim 10, wherein determining a maximum flow through the service system comprises: identifying a third path of services from the source of the flow network graph to the sink of the flow network graph, wherein a service of the third path is also a service of the first path; determining a third minimum capacity of the services of the third path; for the service of the third path which is also a service of the first path, determining that a flow associated with the service of the third path is less than the third minimum capacity; and in response to determining that the flow associated with the service of the third path is less than the third minimum capacity: associating the third minimum capacity with the service of the third path; and subtracting a difference between the third minimum capacity and the first minimum capacity from the assigned residual capacity of the service of the third path.

12. The method of claim 8, wherein determining a maximum flow through the service system comprises: identifying a first path of services from a source of the flow network graph to a sink of the flow network graph; determining a first minimum capacity of the services of the first path; associating a flow with the services of the first path based on the first minimum capacity; for each service of the first path, subtracting the first minimum capacity from the capacity of the service and assignment of the difference as the residual capacity of the service; identifying a second path of services from the source of the flow network graph to the sink of the flow network graph; determining a second minimum capacity of the services of the second path; associating a second flow with the services of the second path based on the second minimum capacity; and for each service of the second path, subtracting the second minimum capacity from the capacity of the service associated and assignment of the difference as the residual capacity of the service.

13. The method of claim 12, wherein determining a maximum flow through the service system comprises: identifying a third path of services from the source of the flow network graph to the sink of the flow network graph, wherein a service of the third path is also a service of the first path; determining a third minimum capacity of the services of the third path; for the service of the third path which is also a service of the first path, determining that a flow associated with the service of the third path is less than the third minimum capacity; and in response to determining that the flow associated with the service of the third path is less than the third minimum capacity: associating the third minimum capacity with the service of the third path; and subtracting a difference between the third minimum capacity and the first minimum capacity from the assigned residual capacity of the service of the third path.

14. The method according to claim 8, wherein the capacities of two or more of the services are different.

15. One or more non-transitory computer-readable media storing program executable by one or more processing units of a computing system to cause the computing system to perform operations comprising: determining, for each of a plurality of different types of requests, a sequence in which services of a service system are executed in response to the request; generating a directed graph based on the sequences, where each vertex of the directed graph represents a respective service of the services; generating a flow network graph by replacing each vertex of the directed graph with two vertices and a directed edge between the two vertices, and associating the directed edge with a capacity of the service represented by the replaced vertex; determining a maximum flow through the service system based on the flow network graph; determining a residual capacity of each service of the service system based on the maximum flow into each service and the capacity of each service; determining, based on the determined residual capacities, one or more services associated with a zero residual capacity and having an upstream service in one of the sequences with a non-zero residual capacity; and increase computing resources available to the determined one or more services.

16. The one or more non-transitory computer-readable media of claim 15, the operations further comprising: determining a second one or more services associated with a non-zero residual capacity based on the determined residual capacities; and decreasing computing resources available to the second one or more services associated with a non-zero residual capacity.

17. The one or more non-transitory computer-readable media of claim 16, wherein determining a maximum flow through the service system comprises: identifying a first path of services from a source of the flow network graph to a sink of the flow network graph; determining a first minimum capacity of the services of the first path; associating a flow with the services of the first path based on the first minimum capacity; for each service of the first path, subtracting the first minimum capacity from the capacity of the service and assignment of the difference as the residual capacity of the service; identifying a second path of services from the source of the flow network graph to the sink of the flow network graph; determining a second minimum capacity of the services of the second path; associating a second flow with the services of the second path based on the second minimum capacity; and for each service of the second path, subtracting the second minimum capacity from the capacity of the service associated and assignment of the difference as the residual capacity of the service.

18. The one or more non-transitory computer-readable media of claim 17, wherein determining a maximum flow through the service system comprises: identifying a third path of services from the source of the flow network graph to the sink of the flow network graph, wherein a service of the third path is also a service of the first path; determining a third minimum capacity of the services of the third path; for the service of the third path which is also a service of the first path, determining that a flow associated with the service of the third path is less than the third minimum capacity; and in response to determining that the flow associated with the service of the third path is less than the third minimum capacity: associating the third minimum capacity with the service of the third path; and subtracting a difference between the third minimum capacity and the first minimum capacity from the assigned residual capacity of the service of the third path.

19. The one or more non-transitory computer-readable media of claim 15, wherein determining a maximum flow through the service system comprises: identifying a first path of services from a source of the flow network graph to a sink of the flow network graph; determining a first minimum capacity of the services of the first path; associating a flow with the services of the first path based on the first minimum capacity; for each service of the first path, subtracting the first minimum capacity from the capacity of the service and assignment of the difference as the residual capacity of the service; identifying a second path of services from the source of the flow network graph to the sink of the flow network graph; determining a second minimum capacity of the services of the second path; associating a second flow with the services of the second path based on the second minimum capacity; for each service of the second path, subtracting the second minimum capacity from the capacity of the service associated and assignment of the difference as the residual capacity of the service; identifying a third path of services from the source of the flow network graph to the sink of the flow network graph, wherein a service of the third path is also a service of the first path; determining a third minimum capacity of the services of the third path; for the service of the third path which is also a service of the first path, determining that a flow associated with the service of the third path is less than the third minimum capacity; and in response to determining that the flow associated with the service of the third path is less than the third minimum capacity: associating the third minimum capacity with the service of the third path; and subtracting a difference between the third minimum capacity and the first minimum capacity from the assigned residual capacity of the service of the third path.

20. The one or more non-transitory computer-readable media according to claim 15, wherein the capacities of two or more of the services are different.

Detailed Description

Complete technical specification and implementation details from the patent document.

A microservice-based application consists of distinct functions implemented using independent microservices deployed within a microservice system. Each microservice is independently accessible and executes in its own computing process in a separate computing system (e.g., server/virtual machine/container). Many requests directed to a microservice-based application are processed using several microservices of the microservice system.

Microservices are often implemented in the cloud in order to leverage the redundancy, economies of scale and other benefits provided by cloud platforms. One such benefit is resource elasticity, which allows the computing resources (e.g., CPU power, memory size, and network bandwidth) consumed by a microservice to be efficiently scaled up and scaled down according to the needs of the microservice. For example, as CPU usage, memory usage, and/or RPS (incoming requests per second) of a microservice increase beyond a threshold, additional resources may be allocated to the microservice. Similarly, resources may be deallocated from the microservice if CPU usage, memory usage, and/or RPS decrease below a given threshold. Resource costs for operating the microservice may be thereby reduced in comparison to systems in which resources are fixedly allocated to serve a maximum anticipated workload.
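The threshold rule described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Metrics` type, the `scaling_decision` function, and the choice to scale up when any metric exceeds its upper threshold but scale down only when all metrics fall below their lower thresholds are assumptions (the text says only "and/or").

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    cpu: float     # CPU utilization as a fraction, 0.0 to 1.0
    memory: float  # memory utilization as a fraction, 0.0 to 1.0
    rps: float     # incoming requests per second


def scaling_decision(current: Metrics, upper: Metrics, lower: Metrics) -> str:
    """Return 'scale_up', 'scale_down' or 'hold' for one microservice."""
    # Assumption: any single metric above its upper threshold triggers scale-up.
    if (current.cpu > upper.cpu or current.memory > upper.memory
            or current.rps > upper.rps):
        return "scale_up"
    # Assumption: scale down only when every metric is below its lower threshold.
    if (current.cpu < lower.cpu and current.memory < lower.memory
            and current.rps < lower.rps):
        return "scale_down"
    return "hold"


# Hypothetical thresholds for illustration only.
UPPER = Metrics(cpu=0.8, memory=0.8, rps=100.0)
LOWER = Metrics(cpu=0.2, memory=0.2, rps=10.0)
```

For example, `scaling_decision(Metrics(0.9, 0.5, 50.0), UPPER, LOWER)` yields a scale-up decision because CPU usage alone exceeds its threshold.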

Resources may be allocated to or deallocated from a microservice based on the capacity of the microservice. The capacity of a single microservice may be estimated by performing load tests during development. The capacity of a microservice is limited by many factors, such as its allocated CPU/memory/disk/network bandwidth.

On the other hand, it is difficult to accurately estimate the capacity of all individual microservices during productive use within a microservice system, hereinafter referred to as the microservice system capacity. The microservice system capacity may indicate where processing bottlenecks might occur and where resources may be over-allocated. Estimating the microservice system capacity is difficult due at least in part to the typically complex relationships between the microservices of the microservice system and the difficulty of simulating productive traffic levels while also monitoring and analyzing the health status of each microservice.

Knowledge of the microservice system capacity may assist in the distribution of hardware/software resources within a microservice system. Systems are desired for efficiently estimating the capacity of a microservice system.

The following description is provided to enable any person in the art to make and use the described embodiments. Various modifications, however, will remain readily-apparent to those in the art.

Some embodiments facilitate estimating the capacity of a microservice system. A microservice system may serve different types of requests and provide responses thereto. To provide such responses, the request is received and instructions are executed on data and with conditions in a structured manner also known as workflow. In other words, the workflow is the sequence of events that provide a response to a request. The execution or carrying-out of a particular type of workflow may require execution of a particular sequence of microservices, while serving another type of workflow may require execution of a different sequence of microservices. The sequences may include the same or different microservices.

Embodiments may operate to determine a sequence of microservices which is executed in response to each of several different types of requests that may require different workflows in order to generate the correct response. A directed graph is generated based on the sequences, where each vertex of the directed graph represents a microservice. To represent the ability of a microservice to conduct data executions, such as amount of CPUs available, amount of memory available, etc., a throughput of each microservice is created for a flow network graph by splitting each vertex of the directed graph into two vertices with a directed edge between, where the directed edge is weighted based on the workload capacity of the service represented by the vertex. A capacity of the microservice system is determined based on the flow network graph, a residual capacity graph is determined based on the capacity and the flow network graph, and microservices which are associated with residual capacity are determined based on the residual capacity graph. Embodiments may then operate to decrease resources available to the microservices which are associated with residual capacity and/or increase resources available to the microservices which are not associated with residual capacity. As a result, a more efficient allocation of resources may be achieved.
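The overall procedure can be sketched end to end as below. The sketch uses the standard Edmonds-Karp method (breadth-first augmenting paths) as a stand-in for whatever maximum-flow computation an embodiment uses; the function name `service_residuals`, the `_in`/`_out` vertex naming, and the four-service example with its capacities are all hypothetical. It assumes every path from source to sink crosses at least one service edge, so each bottleneck is finite.

```python
from collections import defaultdict, deque
import math


def service_residuals(capacity, calls, entry, exits):
    """Return (max_flow, {service: residual capacity}).

    capacity: {service: RPS}; calls: [(caller, callee)];
    entry: services called first; exits: last services of the workflows.
    """
    residual = defaultdict(float)  # remaining capacity per arc
    adj = defaultdict(list)

    def add_arc(u, v, cap):
        residual[(u, v)] += cap
        adj[u].append(v)
        adj[v].append(u)  # reverse arc lets a later path undo earlier flow

    for s, cap in capacity.items():  # split each service vertex in two
        add_arc(f"{s}_in", f"{s}_out", cap)
    for u, v in calls:
        add_arc(f"{u}_out", f"{v}_in", math.inf)
    for s in entry:
        add_arc("source", f"{s}_in", math.inf)
    for s in exits:
        add_arc(f"{s}_out", "sink", math.inf)

    total = 0.0
    while True:
        parent = {"source": None}  # breadth-first search for an augmenting path
        queue = deque(["source"])
        while queue and "sink" not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if "sink" not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], "sink"
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[arc] for arc in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        total += bottleneck
    return total, {s: residual[(f"{s}_in", f"{s}_out")] for s in capacity}


# Hypothetical system: S1 calls S2 and S3, and S2 calls S4.
CAPACITY = {"S1": 100, "S2": 60, "S3": 30, "S4": 50}
CALLS = [("S1", "S2"), ("S2", "S4"), ("S1", "S3")]
flow, residuals = service_residuals(CAPACITY, CALLS, entry=["S1"], exits=["S4", "S3"])
scale_up = [s for s, r in residuals.items() if r == 0]    # saturated services
scale_down = [s for s, r in residuals.items() if r > 0]   # services with headroom
```

In this example the system capacity is 80 RPS: S3 and S4 are saturated and are candidates for additional resources, while S1 and S2 retain 20 and 10 RPS of unused capacity.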

FIG. 1 illustrates a system according to some embodiments. Each of the illustrated components may be implemented using any suitable combination of local, on-premise, cloud-based, distributed (e.g., including distributed storage and/or compute nodes) computing hardware and/or software that is or becomes known. Each component described herein may be executed by one or more physical and/or virtualized servers providing an operating system, services, I/O, storage, libraries, frameworks, etc. In some embodiments, two or more components are implemented by a single server.

One or more components may be implemented by a cloud service (e.g., Software-as-a-Service, Platform-as-a-Service). A cloud-based implementation of any components of FIG. 1 may apportion computing resources elastically according to demand, need, price, and/or any other metric. A cloud-based implementation may utilize a container orchestration platform such as but not limited to Kubernetes. Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.

Microservice system 100 includes microservices S1 through S12. Microservices S1 through S12 may be microservices of one or more microservice-based applications. The applications may comprise single and/or multi-tenant applications. The present description uses the term microservices to generally include microservices, services, applications, and any other independently-executable computing system-provided functionality.

Each of microservices S1 through S12 may be provided by a separate execution environment (e.g., a separate process in a separate computing system). Microservices S1 through S12 may communicate with one another and with other unshown microservices using lightweight network communication mechanisms such as a resource Application Programming Interface (API) via Hyper Text Transfer Protocol (HTTP) request-response messages, but embodiments are not limited thereto.

Microservice system 100 receives incoming requests from external clients. For example, gateway 110 receives a request (e.g., an API call) associated with an application provided by microservice system 100 from a client device. Gateway 110 determines a microservice of the microservice-based application to which the request should be forwarded depending on the type of the request. The microservice receives the request and executes corresponding processing which may include calling another microservice of microservice system 100. The called microservice may in turn call another microservice of microservice system 100, and so on until a response to the initial request is returned, which may rely upon intervening or intermediate responses from microservice-to-microservice requests.

Resource scaling component 120 may determine a capacity of system 100 as described herein. The determination may be based on individual capacities of microservices S1 through S12. The individual capacities may be computed based on the resources of the respective microservices S1 through S12 or via testing, for example. Resource scaling component 120 may determine the individual capacities via a monitoring function in some embodiments. The individual capacities may be represented as a number of requests per second (RPS) but embodiments are not limited thereto.

Resource scaling component 120 may also determine re-allocations of computing resources among microservices S1 through S12 based on the determined capacity of system 100. In some embodiments, resource scaling component 120 may also or alternatively initiate the determined re-allocation of computing resources. The re-allocation may be performed in any manner that is or becomes known. Cloud environments generally provide systems to elastically allocate computing resources to virtual machines based on demand. Microservices S1 through S12 may also be deployed in containers managed by a container orchestration platform which provides efficient autoscaling.

For example, each microservice S1 through S12 may be deployed using a plurality of pods, each of which independently provides an instance of its microservice. Each pod is a collection of one or more containers and runs on a virtual or a physical machine (i.e., a node) which may execute multiple pods. Resources may be allocated to or de-allocated from the microservice by adjusting the number of pods, the number of nodes and/or the computing resources of each node.
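As a rough illustration of adjusting the number of pods, the sketch below converts a target capacity into a pod count. It assumes, purely for illustration, that each pod contributes the same RPS capacity and that capacity adds linearly across pods; neither assumption comes from the description, and `pods_for_capacity` is a hypothetical helper rather than an orchestration platform API.

```python
import math


def pods_for_capacity(target_rps: float, rps_per_pod: float, min_pods: int = 1) -> int:
    """Return the number of pods needed to serve target_rps.

    Assumes identical pods whose capacities add linearly (an illustrative
    simplification, not a property stated in the description).
    """
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    return max(min_pods, math.ceil(target_rps / rps_per_pod))
```

Under these assumptions, a microservice that must serve 250 RPS with pods rated at 100 RPS each would need `pods_for_capacity(250, 100)`, which is 3 pods.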

FIG. 2 is a flow diagram of process 200 to estimate the capacity of a microservice system according to some embodiments. Process 200 and the other processes described herein may be performed using any suitable combination of hardware and software. Program code embodying these processes may be stored by any non-transitory tangible medium, including a fixed disk, a volatile or non-volatile random-access memory, a DVD, a Flash drive, or a magnetic tape, and executed by any number of processing units, including but not limited to processors, processor cores, and processor threads. Such processors, processor cores, and processor threads may be implemented by a virtual machine provisioned in a cloud-based architecture. Embodiments are not limited to the examples described below.

A sequence of services, or flow or workflow, to be executed for each of a plurality of request types is determined at S210. FIG. 3 depicts sequences of services, or workflows, to be executed for each of different types of requests. Although six workflows are depicted in FIG. 3, microservice system 100 may implement one or more applications, each of which is responsive to any number of types of requests.

As shown, a request of a given request type is received by gateway 110 and, in response, a sequence of services (i.e., flow or workflow) for that request type is executed. The sequences and the service executed in each sequence may differ among request types. For example, gateway 110 routes an incoming request of type WF1 to microservice S1. For clarity in FIG. 3, all intermediate requests that are part of an individual workflow will share the same suffix integer (e.g., 1) after the prefix WF. Microservice S1 executes or performs corresponding processing on the initial request received from gateway 110 and calls microservice S4. Microservice S4 then executes or performs processing on the request from S1, which may include all or a portion of the initial request, and calls microservice S7, which executes and calls microservice S10. Responses to the calls are returned through the microservices and a final response is returned to gateway 110.

It should be noted that responses to initial and intermediate requests are not shown in FIG. 3 for clarity. However, the responses to initial and intermediate requests can be represented by the shown arrows (e.g., the arrow WF1 from S1 to S4 is representative of the intermediate request issued from S1 to S4 as well as the response to that request issued from S4 to S1) or other responses could have been added to this figure without departing from the scope of the description herein (e.g., a final response to the initial request WF1 from gateway 110 to S1 could be added from S10 to gateway 110 wherein that response is then forwarded to the requestor who issued the request WF1 to gateway 110).

An incoming request of type WF2 is also routed from gateway 110 to microservice S1. However, unlike incoming requests of type WF1, processing of the request of type WF2 by microservice S1 includes calling microservice S5. Microservice S5 executes or performs corresponding processing on the request received from S1 and calls microservice S8.

A particular microservice may be used in the generation of a response to one or more types of requests. Moreover, a first microservice may call a second microservice during generation of a response to a request of a first type, while the second microservice may call the first microservice during generation of a response to a request of a second type. See, e.g., microservices S8 and S9.
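One plausible way to record the sequences determined for each request type is an ordered list of service names per type, as sketched below. The dictionary layout and the helper names `call_pairs` and `shared_services` are assumptions for illustration; only the WF1 chain (S1, S4, S7, S10) and the first three services of WF2 (S1, S5, S8) are taken from the description, and any later calls in WF2 are not described.

```python
from collections import Counter

# Request type -> ordered list of services executed for that type.
WORKFLOWS = {
    "WF1": ["S1", "S4", "S7", "S10"],
    "WF2": ["S1", "S5", "S8"],  # any further calls are not described in the text
}


def call_pairs(workflows):
    """Return (request_type, caller, callee) for each consecutive pair of services."""
    return [
        (request_type, caller, callee)
        for request_type, sequence in workflows.items()
        for caller, callee in zip(sequence, sequence[1:])
    ]


def shared_services(workflows):
    """Return the services that take part in more than one request type."""
    counts = Counter(s for sequence in workflows.values() for s in set(sequence))
    return {service for service, n in counts.items() if n > 1}
```

Here `shared_services(WORKFLOWS)` contains only S1, the one service that both example workflows use.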

Next, at S220, a directed weighted graph is generated based on the sequence of services determined for each request type, where each vertex of the directed weighted graph represents a visited service. The directed weighted graph includes a directed edge which represents a capacity from the vertex representing a calling service to the vertex representing the called service. Multiple calls in the same direction between the same services are represented by a single edge. Initially, each edge is weighted with a value of infinity or other sufficiently high value so as not to unnecessarily impact the determinations as will be described later.
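The graph construction can be sketched with a dictionary keyed by (caller, callee), which naturally collapses repeated calls in the same direction into one edge while keeping calls in opposite directions separate. The function name `build_directed_graph` and the workflows `WFa` and `WFb` are hypothetical; only the S1, S4, S7, S10 chain comes from the description.

```python
import math


def build_directed_graph(workflows):
    """Return {(caller, callee): weight} with one edge per ordered pair of services."""
    edges = {}
    for sequence in workflows.values():
        for caller, callee in zip(sequence, sequence[1:]):
            # A repeated call in the same direction overwrites the same key,
            # so it stays a single edge. The initial weight is infinity.
            edges[(caller, callee)] = math.inf
    return edges


# Hypothetical sequences chosen to show both behaviours.
workflows = {
    "WF1": ["S1", "S4", "S7", "S10"],
    "WFa": ["S2", "S5", "S8", "S9"],  # S2 calls S5, S8 calls S9
    "WFb": ["S2", "S5", "S9", "S8"],  # S2 calls S5 again, S9 calls S8
}
graph = build_directed_graph(workflows)
```

In the resulting graph, the two S2-to-S5 calls share one edge, whereas S8-to-S9 and S9-to-S8 appear as two distinct edges.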

FIG. 4 shows directed weighted graph 400 based on the sequences shown in FIG. 3. In particular, FIG. 4 focuses on the respective capacities to process intermediate requests between services S1 through S12 and therefore omits the initial capacities of requests from gateway 110 to services S1, S2 and S3. Also, as noted above, any capacity for intermediate responses to the intermediate requests may either be added or may be incorporated into the edges shown in FIG. 4.

400 1 12 2 5 3 6 400 8 9 2 6 Directed weighted graphincludes twelve vertices, each of which represents one of microservices S-S. The capacity to support multiple calls between services Sand S, and Sand Sare each represented by one edge of graph. In contrast, the capacity to support multiple calls between microservices Sand S, and Sand Sare represented by separate edges since they occur in opposite directions.

4 FIG. 400 110 does not include a singular source vertex nor a singular sink vertex for the weighted directed graph. According to some embodiments, other directed weighted graphs could include a singular source vertex and a singular sink vertex. Gatewaymay represent a source vertex, with its capacity for outgoing calls represented by directed edges as described above.

Some microservices do not issue requests to other microservices (e.g., S10, S11 and S12). To facilitate the foregoing computation, a virtual singular sink vertex can be added. The singular sink vertex does not represent any existing service. Edges are created from the last service of every workflow sequence to the sink vertex. As shown in FIG. 5, directed weighted graph 500 includes source vertex 510 and sink vertex 520, with edges from microservices S10, S11 and S12 to sink vertex 520. In some implementations, gateway 110 may be represented by the source node 510 as well as the sink node 520, but, for purposes of computation, a source node 510 separate from sink node 520 is used.
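Adding the source and the virtual sink might look like the sketch below. The helper name `add_source_and_sink`, the vertex labels `"source"` and `"sink"`, and the single example workflow are assumptions made for illustration.

```python
INF = float("inf")


def add_source_and_sink(edges, workflows, source="source", sink="sink"):
    """Return a copy of `edges` with source and sink edges added.

    The source connects to the first service of every workflow, and the
    last service of every workflow connects to the virtual sink.
    """
    augmented = dict(edges)
    for sequence in workflows.values():
        augmented[(source, sequence[0])] = INF
        augmented[(sequence[-1], sink)] = INF
    return augmented


# Hypothetical single workflow, for illustration only.
workflows = {"WF_a": ["S1", "S4", "S10"]}
edges = {("S1", "S4"): INF, ("S4", "S10"): INF}
network = add_source_and_sink(edges, workflows)
```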

At S230, each vertex of the graph is split into two vertices with a directed edge from an input vertex to an output vertex. Moreover, the directed edge is weighted based on the capacity of the service represented by the original vertex. FIG. 6 shows table 600 according to some embodiments. Table 600 provides the capacity of each service of microservice system 100. The capacity is the amount any service, or edge, can process or transmit. Any additional demand for processing or transmission beyond the stated capacity will not be executed efficiently and will have to wait until earlier-initiated processes or transmissions are finished (e.g., a bottleneck). The capacities are provided in units of requests per second (RPS), but embodiments are not limited thereto and could include some combination of available CPU or memory resources, I/O resources, etc. The capacity of each service may be determined based on the resources allocated to the service, testing results and/or by any other methods. As such, the values shown in FIG. 6 may be collected from microservice system 100 periodically to determine the overall capacity thereof, and adjustments can be made to microservice system 100 following the descriptions provided herein.

FIG. 7 illustrates graph 700 after splitting of the vertices at S230. The edges between split vertices S #_0 and S #_1 are weighted with the capacity of service S #, and the other edges remain weighted at infinity or other suitably high value. Leaving the original edges first shown in FIGS. 4 and 5 at infinity or a sufficiently high value, the further determinations made in FIG. 2 are focused on the weighted edges between the split vertices S #_0 and S #_1. This allows for determinations which are more focused on the throughput of each service based on the capacity therein (e.g., available resources) to each service and less focused on the bandwidth constraints between the individual services S1-S12.
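A minimal sketch of the vertex-splitting step follows. The function name `split_vertices` and the capacity values are assumptions for illustration (the values of table 600 are not reproduced here); the `_0`/`_1` suffixes follow the naming used for the split vertices above.

```python
INF = float("inf")


def split_vertices(edges, capacities, source="source", sink="sink"):
    """Turn a service graph into a flow network with per-service capacity.

    Each service S becomes S_0 -> S_1, weighted with the capacity of S.
    Original edges keep their weight and are rewired to leave the caller's
    output vertex (S_1) and enter the callee's input vertex (S_0).
    """
    flow_network = {}
    for service, capacity in capacities.items():
        flow_network[(f"{service}_0", f"{service}_1")] = capacity
    for (u, v), weight in edges.items():
        tail = u if u == source else f"{u}_1"
        head = v if v == sink else f"{v}_0"
        flow_network[(tail, head)] = weight
    return flow_network


# Hypothetical two-service chain with assumed capacities in RPS.
edges = {("source", "S1"): INF, ("S1", "S4"): INF, ("S4", "sink"): INF}
capacities = {"S1": 5000, "S4": 3500}
flow_network = split_vertices(edges, capacities)
```

Splitting is what lets a standard edge-capacity maximum flow routine enforce a limit that conceptually belongs to a vertex.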

In other implementations, the bandwidth between the individual services S1-S12 can be accounted for in the values attributed to the split vertices as shown in FIG. 6. In many computer software implementations, the use of infinity is not possible, and, in these situations, a suitably large number is assigned to the edges between different microservices instead of infinity. There are many ways to arrive at such a sufficiently large number, such as taking the largest number shown in FIG. 6 and multiplying by a factor (e.g., 20) or adding all of the capacities shown in FIG. 6 together and multiplying that sum by a factor (e.g., 4). For the sake of clarity, FIGS. 7, 9, 10, 11 and 12 will use the symbol for infinity instead of a sufficiently large number.
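The two ways of choosing a stand-in for infinity can be sketched as below. The function names and the capacity values are assumptions for illustration; only the example factors (20 and 4) come from the description.

```python
def large_weight_from_max(capacities, factor=20):
    """Largest service capacity multiplied by a factor."""
    return max(capacities.values()) * factor


def large_weight_from_sum(capacities, factor=4):
    """Sum of all service capacities multiplied by a factor."""
    return sum(capacities.values()) * factor


# Hypothetical capacities in RPS, for illustration only.
capacities = {"S1": 5000, "S2": 4000}
by_max = large_weight_from_max(capacities)   # 5000 * 20
by_sum = large_weight_from_sum(capacities)   # (5000 + 4000) * 4
```

Either value exceeds any flow that the finite service edges could carry, so it should not constrain the result.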

Graph 700 is considered a flow network graph in some embodiments. A flow is an amount of throughput through or resources used at a particular service or vertex and it cannot be greater than that service's or vertex's capacity.

Next, at S240, the maximum flow of the flow network graph 700 is determined. It should be noted that a maximum flow is associated with the maximum flow through the entire graph, and thereby the corresponding microservice system 100 of FIGS. 1 and 3. Accordingly, some individual workflows shown in FIG. 3 may be operating at maximum throughput, but other individual workflows may be operating at less than maximum throughput for the resources allocated to each service. That is, some workflows may operate at below maximum throughput so that other workflows may operate at a higher throughput to achieve an overall system that is operating at its maximum throughput. A more optimal flow through microservice system 100 may be achieved by reallocating resources between different services, but this reallocation may still result in some services, and thereby some workflows, operating at below their maximum throughput.

The maximum flow may be determined at S240 based on flow network theory according to some embodiments. A brief description of the maximum flow problem follows, in which N=(V, E) is a network with s, t∈V being the source and sink of N, respectively, and g is a function on the edges of N and its value on (u, v)∈E is g_uv or g(u, v).

The capacity of an edge is the maximum amount of flow that can pass through the edge (i.e., c: E→ℝ⁺). A flow is a map f: E→ℝ in which the flow of an edge cannot exceed its capacity. Stated differently, f_uv≤c_uv for all (u, v)∈E. Moreover, except for the source and the sink, the sum of the flows entering a node must equal the sum of the flows exiting that node.

The value of flow within the network is the amount of flow passing from the source to the sink. A determination of maximum flow seeks to find the flow f_max which represents the maximum possible flow from the source to the sink. Various algorithms for determining the maximum flow exist, such as but not limited to the Ford-Fulkerson algorithm, the Edmonds-Karp algorithm, and Dinic's algorithm.
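For reference, the constraints and the objective described above can be written compactly as follows. The notation |f| for the value of a flow is an assumption introduced here, and the expression for |f| assumes that no edge enters the source s.

```latex
\begin{align*}
f_{uv} &\le c_{uv}
  && \text{for all } (u,v) \in E \\
\sum_{u \,:\, (u,v) \in E} f_{uv} &= \sum_{w \,:\, (v,w) \in E} f_{vw}
  && \text{for all } v \in V \setminus \{s, t\} \\
|f| &= \sum_{v \,:\, (s,v) \in E} f_{sv},
  & f_{\max} &= \arg\max_{f} \, |f|
\end{align*}
```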

FIG. 8 is a flow diagram of process 800 to determine a maximum flow graph from a flow network graph at S240 according to some embodiments. Process 800 may comprise an implementation of the Ford-Fulkerson algorithm in some embodiments.

At S810, the flows of all edges of a flow network graph are set to a weight of zero. FIG. 9 illustrates network graph 900 as updated to represent the flow and capacity of each service by labelling the directed edge of the service with <flow>/<residual capacity>, and with each flow term set to zero. Next, it is determined at S820 whether the flow network graph includes a directed path from the source to the sink. If so, an augmenting path from the source to the sink is identified at S830. The augmenting path may be identified using a depth-first search, a breadth-first search, etc.

At S840, the maximum possible flow that can be sent along this path is determined. The maximum possible flow may be determined by finding the minimum residual capacity of the edges in the path. For example, given the path from source to S2 to S5 to S8 to S11 to sink, the minimum residual capacity (i.e., the difference between a service's capacity and flow) is (4000−0)=4000, corresponding to service S2. Next, at S850, the maximum possible flow for the identified augmenting path is added to the flow values along the path. Similarly, at S860, the maximum possible flow is subtracted from the residual capacity of each edge. FIG. 10 illustrates updated flow network graph 1000 in which 4000 has been added to the <flow> values and has been subtracted from the <residual capacity> values.

Process 800 then returns to S820 to determine whether any other paths from the source to the sink exist in which all services have a residual capacity >0. Process 800 therefore iterates over S830, S840, S850 and S860 until no such paths from the source to the sink remain. FIG. 11 illustrates maximum flow graph 1100 with <flow>/<residual capacity> values for each edge as they exist once it is determined at S820 that no other paths with residual capacity from the source to the sink exist.
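The loop of process 800 can be sketched as below. This is an illustrative sketch under stated assumptions rather than the patented implementation: it uses a breadth-first search to find augmenting paths (which makes it the Edmonds-Karp variant of Ford-Fulkerson), it allows previously routed flow to be cancelled along reverse edges as in the textbook algorithm (a detail the description above does not spell out), and the four-service network with its capacities is hypothetical.

```python
from collections import deque

INF = float("inf")


def max_flow(capacity, source, sink):
    """Return (value, flow) for a network given as {(u, v): capacity}."""
    flow = {edge: 0 for edge in capacity}            # S810: all flows zero
    neighbors = {}
    for u, v in capacity:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)        # allows flow cancelling

    def residual(u, v):
        # Spare forward capacity plus flow on (v, u) that could be undone.
        return capacity.get((u, v), 0) - flow.get((u, v), 0) + flow.get((v, u), 0)

    while True:
        # S820/S830: breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual(u, v) > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break                                    # no path left: done
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        # S840: the minimum residual capacity along the path.
        bottleneck = min(residual(u, v) for u, v in path)
        # S850/S860: push the bottleneck amount along every edge of the path.
        for u, v in path:
            forward = min(bottleneck, capacity.get((u, v), 0) - flow.get((u, v), 0))
            if forward > 0:
                flow[(u, v)] += forward
            if bottleneck > forward:
                flow[(v, u)] -= bottleneck - forward
    value = sum(f for (u, _), f in flow.items() if u == source)
    return value, flow


# Hypothetical split-vertex network: A and C are entrance services,
# A calls B, and C calls both B and D.
capacity = {
    ("source", "A_0"): INF, ("A_0", "A_1"): 5000, ("A_1", "B_0"): INF,
    ("source", "C_0"): INF, ("C_0", "C_1"): 4000, ("C_1", "B_0"): INF,
    ("C_1", "D_0"): INF, ("B_0", "B_1"): 3000, ("B_1", "sink"): INF,
    ("D_0", "D_1"): 2500, ("D_1", "sink"): INF,
}
value, flow = max_flow(capacity, "source", "sink")
```

In this example the returned value is the sum of the flows leaving the source, and services B and D end up saturated, which marks them as the bottlenecks.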

Flow then proceeds from S820 to S860. At S860, the capacity of the microservice system is determined based on the maximum flow graph. According to some embodiments, the capacity is determined as the sum of all flows leaving the source. In the present example, the sum of all flows leaving the source is 5000+4000+4500=13500.

Returning to process 200, a residual capacity graph is determined based on the maximum flow graph at S250. As mentioned above, the edges of the maximum flow graph are labeled <flow>/<residual capacity>. The residual capacity of an edge is the difference between the edge's capacity and its flow from the maximum flow graph. The residual capacity graph includes the <residual capacity> values of each edge from the maximum flow graph but omits edges with a residual capacity=0.

FIG. 12 illustrates residual capacity graph 1200 according to some embodiments. Edges having non-zero integer values represent services with excess capacity. The excess capacity results from flow limits imposed by the capacity of the downstream services.

Services representing bottlenecks and services associated with residual capacity are determined at S260. A bottleneck is the minimum residual capacity of all the edges in a given source-to-sink path. The flow network is at maximum flow only if it includes a bottleneck with a value of zero. For example, it is determined at S260 that microservices S6, S8, S9 are the bottlenecks of the system (i.e., assuming that entrance microservices S1, S2, S3 have been configured to an optimum state of resource utilization). Moreover, it is determined at S260 that microservice S3 has a residual capacity of 1500, microservice S4 has a residual capacity of 3500, microservice S5 has a residual capacity of 2500, microservice S9 has a residual capacity of 300, microservice S10 has a residual capacity of 2200, microservice S11 has a residual capacity of 2100, and microservice S12 has a residual capacity of 1600.
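The classification performed at S250 and S260 can be sketched as follows. The function name `classify_services` is an assumption, and the capacity and flow figures are illustrative: the S3 values are chosen to reproduce the residual capacity of 1500 stated above, while the S6 values are hypothetical.

```python
def classify_services(capacities, flows):
    """Split services into bottlenecks and services with excess capacity.

    Residual capacity is capacity minus flow. A residual of zero marks a
    bottleneck; a positive residual marks excess capacity.
    """
    residual = {s: capacities[s] - flows.get(s, 0) for s in capacities}
    bottlenecks = sorted(s for s, r in residual.items() if r == 0)
    excess = {s: r for s, r in residual.items() if r > 0}
    return bottlenecks, excess


# Illustrative per-service capacities and maximum-flow values in RPS.
capacities = {"S3": 6000, "S6": 4500}
flows = {"S3": 4500, "S6": 4500}
bottlenecks, excess = classify_services(capacities, flows)
```

Here S6 is reported as a bottleneck and S3 as having 1500 RPS of excess capacity, so S6 would be a candidate for more resources.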

At S270, resources of services representing bottlenecks are increased and resources of services associated with residual capacity are decreased. For example, to fully utilize the residual capacity of S3, resources may be added to service S6 to increase its capacity by 1500, and resources may be added to services S8 and S9 to increase their combined capacities by 1500. As a result of these changes, the maximum flow is 5000+4000+6000=15000, and the full capacities of entrance services S1, S2, S3 are utilized.

Service S4 has a residual capacity of 3500 and service S5 has a residual capacity of 2500. Resources allocated to these services may be decreased to reduce their residual capacity. Service S10 has a residual capacity of 2200, service S11 has a residual capacity of 2100, and service S12 has a residual capacity of 1600. The resources allocated to these services may also be decreased, but while ensuring that their resulting capacities can bear the new flow of service S3 (i.e., 1500).

Resources may be allocated to or de-allocated from a service in any suitable manner, including but not limited to changing a cloud configuration, adding or removing Kubernetes pods, etc. FIG. 13 illustrates table 1300 of microservice capacities after S270 according to some embodiments. Compared to table 600 of FIG. 6, the resources available to services S6, S8 and S9 have been increased to increase their capacities as described above, while the resources available to services S4, S5, S10, S11 and S12 have been decreased.

There are additional ways in which resources may be reallocated. Not every request can necessarily be processed in equal time. More complex requests may take more time to process than simpler requests. In other words, some services may see a different ratio of added resources (e.g., CPU units, memory units) to increased throughput (e.g., requests per second). As an example, adding or subtracting a CPU unit or core to a first service may yield an increased or decreased capacity change of 1000 RPS, respectively, while the addition or subtraction of a single CPU unit or core to a different service may only yield a change of 500 RPS. Thus, additional resources can be reallocated using a resource-to-throughput ratio.
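The resource-to-throughput ratio can be applied as in the sketch below. The function name `cores_for_capacity_change` and the rounding policy are assumptions; the 1000 RPS and 500 RPS per-core figures are the example values given above.

```python
import math


def cores_for_capacity_change(delta_rps, rps_per_core):
    """Whole CPU cores to add (positive) or remove (negative).

    Rounding up means an increase always meets the requested capacity,
    and a decrease never removes more capacity than requested.
    """
    return math.ceil(delta_rps / rps_per_core)


add_fast = cores_for_capacity_change(1500, 1000)    # service at 1000 RPS per core
add_slow = cores_for_capacity_change(1500, 500)     # service at 500 RPS per core
remove = cores_for_capacity_change(-3500, 1000)     # shed up to 3500 RPS
```

The same 1500 RPS increase costs two cores on the first service and three on the second, which is why the ratio matters when deciding where to move resources.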

In addition, resources can be reallocated with respect to specific workflows. As shown in FIG. 3, the initiating service of workflows WF1, WF2, WF3 and WF4 is either service S1 or service S2. From FIG. 12, resources are suggested for reallocation such that services S1, S2 and S7 would be operating at maximum throughput (e.g., they do not have any excess resources). Workflows WF1 and WF3 utilize service S7 while the other workflows WF2, WF4, WF5 and WF6 do not. Under the assumption that services S1 and S2 cannot receive additional resources to increase their respective capacities, perhaps constrained by limitations imposed by gateway 110, it is therefore possible that increasing the resources to service S7 may not yield increased throughput to workflows WF1 and WF3 as those workflows may already be limited by the services S1 and S2, respectively. In contrast, increasing resources to services S6 and S8 may improve the throughputs of workflows WF5 and WF6 as those two workflows have an initial service with extra capacity (e.g., service S3) and have downstream services S6 and S8 acting as bottlenecks.

FIG. 14 illustrates a cloud-based deployment according to some embodiments. The illustrated components may comprise cloud-based compute resources residing in one or more public clouds providing self-service and immediate provisioning, autoscaling, security, compliance, and identity management features.

Execution environments 1410-1440 may comprise servers or virtual machines of a Kubernetes cluster. Execution environments 1410-1440 may support containerized applications which provide one or more services to users. Execution environment 1410 may execute a gateway such as gateway 110 and execution environments 1420-1440 may execute microservices of a microservice-based application as described herein.

The foregoing diagrams represent logical architectures for describing processes according to some embodiments, and actual implementations may include more or different components arranged in other manners. Other topologies may be used in conjunction with other embodiments. Moreover, each component or device described herein may be implemented by any number of devices in communication via any number of other public and/or private networks. Two or more of such computing devices may be located remote from one another and may communicate with one another via any known manner of networks and/or a dedicated connection. Each component or device may comprise any number of hardware and/or software elements suitable to provide the functions described herein as well as any other functions. For example, any computing device used in an implementation of a system according to some embodiments may include a processor to execute program code such that the computing device operates as described herein.

All systems and processes discussed herein may be embodied in program code stored on one or more non-transitory computer-readable media. Such media may include, for example, a hard disk, a DVD-ROM, a Flash drive, magnetic tape, and solid-state Random Access Memory (RAM) or Read Only Memory (ROM) storage units. Embodiments are therefore not limited to any specific combination of hardware and software.

Embodiments described herein are solely for the purpose of illustration. Those in the art will recognize other embodiments may be practiced with modifications and alterations to that described above.

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F G06F9/5027

Patent Metadata

Filing Date

July 17, 2024

Publication Date

January 22, 2026

Inventors

Hui LI
