US-12566659-B2

Methods, devices, and electronic devices for locating anomaly root causes

Published March 3, 2026


Technical Abstract

The present disclosure discloses a method, a device, and an electronic device for locating an anomaly root cause. The method for locating the anomaly root cause includes obtaining an anomaly indicator of a target system, obtaining a data item set based on the anomaly indicator, determining an anomaly data item based on data fluctuation information of each data item of the data item set, obtaining an anomaly field corresponding to the anomaly data item based on the anomaly data item, and locating an anomaly root cause of the target system based on the anomaly field.

Patent Claims


. A method for locating an anomaly root cause, comprising:

. The method of, wherein the obtaining a data item set based on the anomaly indicator includes:

. The method of, wherein the in response to determining that the first sample dispersion is smaller than the second sample dispersion, determining whether the data item is an anomaly data item based on a maximum dispersion of the first sample and a mean deviation of the data items of second sample includes:

. The method of, wherein the obtaining an anomaly field corresponding to the anomaly data item based on the anomaly data item includes:

. The method of, wherein the locating an anomaly root cause of the target system based on the anomaly field includes:

. A device for locating an anomaly root cause, comprising:

. An electronic device, comprising at least one processor, at least one memory, and computer program instructions stored in the at least one memory that, when the computer program instructions are executed by the at least one processor, direct the at least one processor to perform the method of.

. The method of, further including:

. The method of, wherein a determination of the preset time period includes:

. The method of, wherein the determining an anomaly data item based on data fluctuation information of each first sample and each second sample further includes:

. The method of, wherein the determining a comprehensive anomaly degree based on the first anomaly degree and the second anomaly degree includes:

. The method of, wherein the fluctuation prediction model includes a plurality of different specific models, and each specific model refers to a fluctuation prediction model for a particular anomaly indicator.

Detailed Description


This application is a continuation-in-part of International Application No. PCT/CN2023/101083, filed on Jun. 19, 2023, which claims priority to Chinese Patent Application No. 202211343420.0, filed on Oct. 31, 2022, the entire contents of each of which are hereby incorporated by reference.

The present disclosure relates to the field of intelligent operation and maintenance technology, and in particular, to methods, devices, and electronic devices for locating anomaly root causes.

As the value of data becomes increasingly evident, the application of data in various industries is constantly deepening. Visualizing the data reflects the statuses and trends of business operations. A critical focus in data utilization involves identifying current business problems through changes in indicator data. A normal interval for an indicator may be set by combining business experience and historical data. An indicator value is monitored and judged to reflect whether the indicator is anomalous, or an anomaly point is identified by a statistical algorithm, a machine learning algorithm, etc., and anomaly data is marked on the indicator by color, etc., to prompt the user to find an anomaly indicator as soon as possible. However, existing methods for locating the anomaly indicator are unable to accurately locate a root cause of the anomaly indicator and require manual participation, which is time-consuming and laborious.

The main purpose of the present disclosure is to provide a method for locating an anomaly root cause, to solve the technical problem that the current method for locating an anomaly indicator may not accurately locate a root cause of the anomaly indicator.

The present disclosure provides a method for locating an anomaly root cause. The method may include obtaining the anomaly indicator of a target system; obtaining a data item set based on the anomaly indicator; obtaining a first sample and a second sample corresponding to each data item of the data item set based on a time node when the anomaly indicator appears, the first sample being a set of data items corresponding to the data item within a preset time period before the time node, and the second sample being a set of data items corresponding to the data item at the time node and within a preset time period after the time node; determining an anomaly data item based on data fluctuation information of each first sample and each second sample, including determining an average value of data items of the first sample, obtaining a first sample dispersion and a second sample dispersion by determining a dispersion of the first sample and a dispersion of the second sample relative to the average value of the data items, in response to determining that the first sample dispersion is greater than or equal to the second sample dispersion, the data item being a non-anomaly data item, and in response to determining that the first sample dispersion is smaller than the second sample dispersion, determining whether the data item is an anomaly data item based on a maximum dispersion of the first sample and a mean deviation of data items of the second sample; obtaining an anomaly field corresponding to the anomaly data item based on the anomaly data item; and locating an anomaly root cause of the target system based on the anomaly field.

The present disclosure also provides a device for locating an anomaly root cause. The device may include a first obtaining module configured to obtain the anomaly indicator of the target system; a second obtaining module configured to obtain the data item set based on the anomaly indicator; a determination module configured to determine the anomaly data item based on the data fluctuation information of each data item of the data item set; a third obtaining module configured to obtain the anomaly field corresponding to the anomaly data item based on the anomaly data item; and a localization module configured to locate the anomaly root cause of the target system based on the anomaly field.

The present disclosure further provides an electronic device including at least one processor, at least one memory, and computer program instructions stored in the at least one memory that, when the computer program instructions are executed by the at least one processor, direct the at least one processor to perform the method for locating the anomaly root cause.

In summary, the beneficial effects of the present disclosure are as follows.

The method for locating the anomaly root cause in the present disclosure solves the technical problem that the current method for locating the anomaly indicator may not accurately locate the root cause of the anomaly indicator. It does so by obtaining the anomaly indicator of the target system, obtaining the data item set based on the anomaly indicator, determining the anomaly data item based on the data fluctuation information of each data item of the data item set, obtaining the anomaly field corresponding to the anomaly data item based on the anomaly data item, and locating the anomaly root cause of the target system based on the anomaly field. The method locates the anomaly data item based on the data fluctuation information of each data item of the data item set obtained from the anomaly indicator, accurately determines the anomaly data item in the data item set through the data fluctuation information, eliminates data items unrelated to the root cause of the anomaly, narrows the range of the anomaly root cause, and locates the anomaly field based on the anomaly data item. This helps a user analyze what produced the anomaly root cause and take measures in time, so as to improve the operational efficiency of the enterprise. Finally, the method locates the anomaly root cause based on the anomaly field and traces back the anomaly root cause based on the anomaly indicator, which further narrows the range of the anomaly root cause and improves the accuracy of locating the anomaly root cause.

It should be understood that the specific embodiments described herein are intended only for the purpose of interpreting the present disclosure and are not intended to limit it.

In the prior technology, a common method for locating an anomaly root cause includes setting a normal interval for an indicator and monitoring an indicator value to reflect whether the indicator is anomalous; identifying an anomaly point by a statistical algorithm, a machine learning algorithm, etc.; and marking an anomaly indicator by color, etc., to prompt a user to find the anomaly indicator as soon as possible. The anomaly root cause is then located either by having the relevant business personnel analyze and discuss the anomaly indicator, or through a data item corresponding to the anomaly indicator. While the common method solves the problem of locating the anomaly root cause to a certain extent, it suffers from the following problems. One data item may correspond to a plurality of fields in a database, including a dimension field and a calculated field. The dimension field is not involved in the calculation of the indicator, and the calculated field may be a single field or multiple fields. If the anomaly root cause is located only by the data item corresponding to the anomaly indicator, the range of the anomaly root cause may become larger, which reduces the accuracy of locating the anomaly root cause.

is a flowchart illustrating an exemplary method for locating an anomaly root cause according to some embodiments of the present disclosure.

As shown in the flowchart, in order to solve the above technical problems, the present disclosure provides a method for locating the anomaly root cause, which may be executed by a processor of an electronic device. In some embodiments, the process includes the following operations.

In S, an anomaly indicator of a target system is obtained.

The target system refers to a system that requires anomaly analysis and/or processing (e.g., locating the anomaly root cause), which may be a variety of computer systems or business systems. For example, the target system may be a banking system, a power system, or the like. In some embodiments, the processor may obtain the anomaly indicator of the target system. The anomaly indicator refers to a data anomaly point whose value exceeds a preset threshold. In the embodiment, the processor sets a threshold corresponding to each indicator in the target system. When an indicator exceeds the corresponding threshold, the target system issues a warning message, and the processor locates the anomaly indicator based on the warning message.
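The threshold check described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the names `WarningMessage` and `find_anomaly_indicators`, the dictionary layout, and the use of a strict "greater than" comparison are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WarningMessage:
    """Hypothetical warning issued when an indicator exceeds its threshold."""
    indicator: str
    value: float
    threshold: float
    time_node: int  # time at which the anomaly indicator appeared


def find_anomaly_indicators(
    readings: Dict[str, float],
    thresholds: Dict[str, float],
    time_node: int,
) -> List[WarningMessage]:
    """Return one warning per indicator whose value exceeds its preset threshold."""
    warnings = []
    for name, value in readings.items():
        limit = thresholds.get(name)
        # Indicators without a configured threshold are skipped (an assumption).
        if limit is not None and value > limit:
            warnings.append(WarningMessage(name, value, limit, time_node))
    return warnings
```

The `time_node` carried by each warning is what later operations would use to split the samples around the moment the anomaly appeared.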

In S, a data item set is obtained based on the anomaly indicator.

In some embodiments, in the target system, one indicator may correspond to one or more data items, and the processor may obtain the data item set based on the corresponding anomaly indicator. The data items refer to various types of parameter items preset in the target system based on business requirements. For example, when the data item is a central processing unit (CPU) temperature, the anomaly indicator may be that the CPU temperature exceeds a preset temperature threshold (e.g., at a certain time node t, the CPU temperature exceeds 70° C.), and the data item set corresponding to the anomaly indicator may be {CPU temperature}. The CPU temperature values at a plurality of points in time before the time node t form a set of data items corresponding to the data item.

is a flowchart illustrating an exemplary process for obtaining a data item set according to some embodiments of the present disclosure.

In some embodiments, operation S may be realized based on a process, which may be executed by a processor. As shown in the flowchart, the process includes the following operations.

In S, whether an anomaly indicator is a derived indicator is determined.

Specifically, the derived indicator refers to an indicator after a combination operation is performed on a data item, i.e., one derived indicator corresponds to two or more data items. The processor may first determine whether the anomaly indicator is the derived indicator. In response to determining that the anomaly indicator is the derived indicator, the processor may disassemble the anomaly indicator to obtain a data item set corresponding to the anomaly indicator, avoiding the problem of wrongly locating the anomaly root cause due to missing the data item. In response to determining that the anomaly indicator is not the derived indicator, the processor may omit the operation of disassembling the anomaly indicator, thus improving the efficiency of locating the anomaly root cause.

In S, in response to determining that the anomaly indicator is the derived indicator, the data item set is obtained by disassembling the anomaly indicator according to a calculation rule of the anomaly indicator.

The anomaly indicator being the derived indicator means that the anomaly indicator is formed by performing a combination operation on two or more data items, and the processor disassembles the anomaly indicator according to the calculation rule of the anomaly indicator to obtain the data item set. For example, according to a combination operation rule, an indicator is calculated from a data item d_1 and a data item d_2, and if the indicator is the anomaly indicator, the data item set corresponding to the anomaly indicator is {d_1, d_2}. For example, for an anomaly indicator that an order transaction failure rate is higher than an expected threshold for a certain cycle, the order transaction failure rate is calculated from a count of orders with failed transactions and a total count of orders, so the corresponding data item set may be {orders with failed transactions, the count of orders}. This is intended as an example only, and the actual operation rules may be determined by the business rules of the target system.

The anomaly indicator may be caused by one or more data items in the data item set. By disassembling the anomaly indicator to get the data item set, it is possible to obtain all data items related to the anomaly indicator, avoiding the problem of wrongly locating the anomaly root cause due to missing the data item and improving the accuracy of localization.

In S, in response to determining that the anomaly indicator is not the derived indicator, the anomaly indicator is determined as the data item set.

In response to determining that the anomaly indicator is not the derived indicator, i.e., the anomaly indicator is an atomic indicator, then the processor does not need to disassemble the anomaly indicator and designates the anomaly indicator as the data item set. Therefore, the obtained data item set includes only one data item. The above process of omitting the operation of disassembling the anomaly indicator improves the efficiency of locating the anomaly root cause.
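The derived-versus-atomic branching can be illustrated with a short sketch. The registry `CALCULATION_RULES`, the function `get_data_item_set`, and the indicator and data item names are hypothetical; in practice the calculation rules would come from the business rules of the target system.

```python
from typing import Dict, List, Optional

# Hypothetical registry mapping each derived indicator to the data items
# that its calculation rule combines. Atomic indicators are simply absent.
CALCULATION_RULES: Dict[str, List[str]] = {
    "order_failure_rate": ["failed_order_count", "total_order_count"],
}


def get_data_item_set(
    anomaly_indicator: str,
    rules: Optional[Dict[str, List[str]]] = None,
) -> List[str]:
    """Disassemble a derived indicator, or wrap an atomic one as a one-item set."""
    if rules is None:
        rules = CALCULATION_RULES
    items = rules.get(anomaly_indicator)
    if items:
        # Derived indicator: return every data item used by its calculation rule.
        return list(items)
    # Atomic indicator: no disassembly needed, the indicator is the data item set.
    return [anomaly_indicator]
```

For example, `get_data_item_set("order_failure_rate")` yields both underlying counts, while an indicator with no registered rule, such as `"cpu_temperature"`, is returned unchanged as a single-item set.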

In S, an anomaly data item is determined based on data fluctuation information of each data item of the data item set.

Since there may be one or more anomaly data items in the data item set, the one or more anomaly data items are determined based on the data fluctuation information of each data item of the data item set. Data items unrelated to the root cause of the anomaly are eliminated, which further narrows the range of the anomaly root cause and improves the accuracy of locating the anomaly root cause.

In some embodiments, the data fluctuation information may be determined by a Pearson correlation coefficient. The processor selects a preset count of data items before or after a time node when the anomaly indicator appears as a data sample, and substitutes each data item and the anomaly indicator in the data sample into a Pearson correlation coefficient calculation formula to obtain a correlation value between each data item and the anomaly indicator in the data sample. Then, the processor ranks the data items according to the correlation values, and the data item with the largest correlation value is the anomaly data item. The Pearson correlation coefficient is prior art and will not be repeated herein.
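As an illustration of the ranking step, the sketch below computes the standard Pearson correlation coefficient between each data item series and the anomaly indicator series and sorts the data items by that value. The function names and the input layout (equal-length lists of values sampled at the same points in time) are assumptions for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Standard Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        # A constant series has no defined correlation; treat it as 0 (assumption).
        return 0.0
    return cov / (norm_x * norm_y)


def rank_by_correlation(
    indicator: Sequence[float],
    data_items: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank data items by correlation with the indicator, largest first."""
    scores = {name: pearson(values, indicator) for name, values in data_items.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```

The first entry of the returned list corresponds to the data item with the largest correlation value. Ranking by the absolute value instead would also surface strongly negative correlations; whether that is desirable is not stated in the text.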

is a flowchart illustrating an exemplary process for determining an anomaly data item based on data fluctuation information according to some embodiments of the present disclosure.

As some optional embodiments of the present disclosure, operation S may be realized based on a process, which may be executed by a processor. As shown in the flowchart, the process includes the following operations.

In S, a first sample and a second sample corresponding to each data item of a data item set are obtained based on a time node when an anomaly indicator appears, where the first sample is a set of data items corresponding to the data item within a preset time period before the time node, and the second sample is a set of data items corresponding to the data item at the time node and within a preset time period after the time node.

In this embodiment, the processor first obtains the time node when the anomaly indicator appears, and the time node is obtained by a warning message of the anomaly indicator. After obtaining the time node, for each data item of the data item set, the processor obtains the corresponding first sample and the corresponding second sample. The first sample is the set of data items corresponding to the data item within the preset time period before the time node, and the second sample is the set of data items corresponding to the data item at the time node and within the preset time period after the time node.
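A minimal sketch of the sample split follows, assuming each data item is available as a list of `(timestamp, value)` pairs and that the time node and preset time period are expressed in the same time unit. The function name `split_samples` and this data layout are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def split_samples(
    series: Sequence[Tuple[float, float]],
    time_node: float,
    period: float,
) -> Tuple[List[float], List[float]]:
    """Split one data item's (timestamp, value) series around the time node.

    first:  values within the preset period strictly before the time node
    second: values at the time node and within the preset period after it
    """
    first = [v for t, v in series if time_node - period <= t < time_node]
    second = [v for t, v in series if time_node <= t <= time_node + period]
    return first, second
```

Note that the value at the time node itself belongs to the second sample, matching the definition above.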

The preset time period is determined based on the target system. Merely by way of example and not as a limitation, the preset time period may be 10 minutes, one hour, one day, one week, etc. The processor denotes the first sample as T{x_1, x_2, . . . , x_n} and the second sample as S{y_1, y_2, . . . , y_n}, where y_1 denotes the data item corresponding to the time node when the anomaly indicator appears, and n denotes a count of data items.

When the target system is a power system, a banking system, or another system with a large data acquisition volume, the preset time period may be set in units of minutes, to avoid reducing the efficiency of locating the anomaly root cause with an excessively large sample volume. When the target system is a system with a smaller data acquisition volume, the preset time period may be set in units of hours or other larger time units, to avoid inaccurate data fluctuation information from an excessively small sample volume, which may lead to wrongly locating the anomaly root cause.

In some embodiments, the processor determines the preset time period based on a data item characteristic of each data item and a system characteristic of the target system.

The data item characteristic of each data item includes at least one of a data acquisition volume, a data updating frequency, or an updating heat cycle of each data item. The system characteristic of the target system includes at least one of a data acquisition volume, a system load, or a network load of the target system.

The data item characteristic may be obtained through data monitoring and statistics (e.g., database statistics) on each data item. The data acquisition volume of each data item characterizes the average data volume (e.g., data flow) of the data item in a preset unit of time (e.g., 1 minute). The data updating frequency characterizes the frequency of updating (e.g., reading and/or writing) data in the preset unit of time. The updating heat cycle includes an updating heat (e.g., an updating frequency) in different time periods within a cycle, and the cycle may be a day, a month, etc., which may be determined based on the historical data corresponding to the data item.

The system characteristic of the target system is used to reflect an operational status of the target system, which may be obtained by monitoring various operational parameters of the target system. For example, the data acquisition volume may be obtained by monitoring a data throughput volume, the system load may be obtained by monitoring resource utilization (e.g., a memory occupation, a CPU usage, a disk capacity, etc.) of the target system, and the network load may be obtained by monitoring the network (e.g., a network request volume).

The data item characteristic and the system characteristic may comprehensively reflect a state of each data item and/or the target system when the anomaly indicator appears, which may be stored in forms of a log, a data table, etc., and used for subsequent analysis and processing (e.g., tracking, locating, processing, etc., an anomaly or the anomaly root cause). In some embodiments, the processor may perform a time period matching process in an anomaly localization data table based on the data item characteristic and the system characteristic to determine the preset time period.

The anomaly localization data table includes a plurality of anomaly localization records in a historical time period (e.g., the past year, the past half year, etc.). An anomaly localization record may include a historical anomaly time point when a historical anomaly indicator appeared, a historical data item characteristic of each data item, and a historical system characteristic of the target system. The anomaly localization record also includes a historical preset time period when locating and processing a historical anomaly root cause, a historical anomaly localization result (e.g., the anomaly root cause), and a failure rate after a historical anomaly localization.

The failure rate after the historical anomaly localization is used to reflect the appearance frequency of anomalies in the target system after a historical anomaly has been repaired, e.g., the appearance count of the anomalies within a preset monitoring cycle (e.g., a day, a week). The more accurate the anomaly localization result, the higher the efficiency and accuracy of repairing an anomaly, and the lower the subsequent failure rate.

The processor may match a target anomaly localization record in the anomaly localization data table by the time period matching process. The target anomaly localization record may be an anomaly localization record that has a similarity of less than a threshold to the current data item characteristic and system characteristic and has the lowest failure rate after the historical anomaly localization. The processor may designate the historical preset time period corresponding to the target anomaly localization record as the preset time period. The similarity may be determined based on a vector distance (e.g., a Euclidean distance, etc.), etc.
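The time period matching process might look like the sketch below. It rests on several assumptions not stated in the text: the data item characteristic and system characteristic are concatenated into one numeric feature vector, "similarity of less than a threshold" is read as a Euclidean distance below `max_distance`, and the record fields are named as shown. `LocalizationRecord` and `match_preset_period` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class LocalizationRecord:
    """Hypothetical row of the anomaly localization data table."""
    features: Sequence[float]     # historical data item + system characteristics
    preset_period_minutes: float  # historical preset time period
    failure_rate: float           # failure rate after the historical localization


def match_preset_period(
    current: Sequence[float],
    records: List[LocalizationRecord],
    max_distance: float,
) -> Optional[float]:
    """Pick the period of the closest-enough record with the lowest failure rate."""
    # Keep only records whose Euclidean distance to the current state is small.
    candidates = [
        r for r in records if math.dist(current, r.features) < max_distance
    ]
    if not candidates:
        return None  # no sufficiently similar history; caller needs a fallback
    best = min(candidates, key=lambda r: r.failure_rate)
    return best.preset_period_minutes
```

In practice the features would likely need normalization before distances are compared, since volumes, frequencies, and loads are on different scales; the text does not specify how this is handled.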

In some embodiments of the present disclosure, considering the complexity of anomaly tracking, the preset time period is determined by comprehensively considering the current data item characteristic and the system characteristic, to make locating and processing of the subsequent anomaly root cause more comprehensive and more in line with the actual situation, thus enhancing the accuracy of an anomaly root cause localization result.

In S, the anomaly data item is determined based on data fluctuation information of each first sample and each second sample.

In some embodiments, the processor obtains the data fluctuation information of the first sample and the second sample separately. Since the second sample includes a data item corresponding to the anomaly indicator, if there is an anomaly data item in the second sample, the data fluctuation information of the second sample and the data fluctuation information of the first sample may differ. The processor compares the data fluctuation information of the first sample and the data fluctuation information of the second sample of each data item of the data item set, to determine the anomaly data item in the data item set. In contrast to prior art that determines the anomaly data item by an influence factor obtained only from the difference between a data item and a preset threshold, the present embodiment determines the anomaly data item based on the data fluctuation information of the first sample and the data fluctuation information of the second sample. The present embodiment avoids a situation in which the preset threshold is set incorrectly, and determines the anomaly data item more accurately, thereby improving the accuracy of the localization of the anomaly root cause. Furthermore, determining the anomaly data item in the data item set based on the data fluctuation information eliminates the data item that is not related to the root cause that causes anomaly, and narrows the range of the anomaly root cause.
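The comparison of the two samples can be sketched using the outline given in the summary of the disclosure: both dispersions are measured relative to the average of the first sample, a data item whose first sample dispersion is greater than or equal to its second sample dispersion is non-anomalous, and otherwise the decision uses the maximum dispersion of the first sample and the mean deviation of the second sample. The exact formulas are not reproduced in this text, so the sketch assumes mean squared deviation as the dispersion and assumes the final rule flags an anomaly when the mean absolute deviation of the second sample exceeds the largest absolute deviation seen in the first sample.

```python
from typing import Sequence


def is_anomaly_data_item(first: Sequence[float], second: Sequence[float]) -> bool:
    """Hedged sketch of the fluctuation comparison for one data item."""
    if not first or not second:
        raise ValueError("both samples must be non-empty")
    mean_first = sum(first) / len(first)

    # Assumed dispersion: mean squared deviation from the first-sample average.
    disp_first = sum((x - mean_first) ** 2 for x in first) / len(first)
    disp_second = sum((y - mean_first) ** 2 for y in second) / len(second)

    # Stated in the disclosure: no increase in dispersion means non-anomalous.
    if disp_first >= disp_second:
        return False

    # Assumed final rule: compare the second sample's mean absolute deviation
    # against the largest absolute deviation observed in the first sample.
    max_dev_first = max(abs(x - mean_first) for x in first)
    mean_dev_second = sum(abs(y - mean_first) for y in second) / len(second)
    return mean_dev_second > max_dev_first
```

Because the baseline comes from the data item's own history rather than from a fixed threshold, a mis-set threshold does not affect this check.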

As some optional embodiments of the present disclosure, the operation S includes the following operations.

In S, an average value of data items of the first sample is determined.

In some embodiments, the average value of the data items of the first sample is determined according to the following formula (1):
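Formula (1) itself is not reproduced in this text. Assuming the first sample contains n values x_1, x_2, . . . , x_n and that the average value is the ordinary arithmetic mean, it would conventionally be written as follows; the symbol \bar{x} is an assumption introduced here for illustration.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
```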
