US-12602471-B2

Anomaly detection system

Published April 14, 2026

Assignee not available in USPTO data we have

Inventors not available in USPTO data we have

Technical Abstract

A system includes a computer. The computer includes a processor and a memory. The memory includes instructions such that the processor is programmed to: construct a short-term graph based on data representing one or more primary events, update a long-term graph to include elements from the short-term graph when a numerical representation of the short-term graph deviates from a graph profile, and determine whether a maliciousness probability of the long-term graph exceeds an anomaly threshold.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

. A system comprising a computer including a processor and a memory, the memory including instructions such that the processor is programmed to:

. The system of, wherein the processor is further programmed to construct the short-term graph within the first data structure based on parsed log data representing primary events occurring within a predefined time window within the communication network.

. The system of, wherein the processor is further programmed to receive log data from at least one endpoint monitoring agent.

. The system of, wherein the processor is further programmed to calculate the updated maliciousness probability for the clusters according to P_new=(P_cur·N_cur+p_i·n_i+0.5·O·(P_cur+p_i))/(N_cur+n_i+O), where P_cur is current cluster maliciousness probability, N_cur is number of nodes in current cluster, p_i is probability of new cluster, n_i is number of nodes in new cluster, and O is number of overlapping nodes.

. A method comprising:

. The method of, further comprising constructing the short-term graph within the first data structure based on parsed log data representing primary events occurring within a predefined time window within the communication network.

. The method of, the method further comprising receiving log data from at least one endpoint monitoring agent.

. A system comprising a computer including a processor and a memory, the memory including instructions such that the processor is programmed to:

. The system of, wherein the processor is further programmed to receive log data from at least one endpoint monitoring agent.

Detailed Description

Complete technical specification and implementation details from the patent document.

Computer networks may include multiple computing assets that enable users to access shared resources including a variety of digital content accessible by a communication network. A computer network can be a set of computers connected to form one or more nodes within a personal area network, a local/virtual area network, a wide area network, or any other type of network architecture associated with a collection of computing devices. Access to the Internet external to a particular network presents a variety of cyber security challenges. As such, computing assets within an example computer network may be susceptible to data breaches or attacks from malicious users seeking unauthorized access to one or more assets within the network.

In other features, the processor is further programmed to generate an alert when the maliciousness probability exceeds the anomaly threshold.

In other features, the data representing one or more primary events comprises parsed log data.

In other features, the processor is further programmed to receive log data from at least one endpoint monitoring agent.

In other features, the processor is further programmed to determine whether the numerical representation of the short-term graph deviates from the graph profile using a machine learning module.

In other features, the machine learning module comprises a plurality of machine learning models configured as a machine learning ensemble.

A method comprises constructing a short-term graph based on data representing one or more primary events, updating a long-term graph to include elements from the short-term graph when a numerical representation of the short-term graph deviates from a graph profile, and determining whether a maliciousness probability of the long-term graph exceeds an anomaly threshold.

In other features, the method includes generating an alert when the maliciousness probability exceeds the anomaly threshold.

In other features, the data representing one or more primary events comprises parsed log data.

In other features, the method includes receiving log data from at least one endpoint monitoring agent.

In other features, the method includes determining whether the numerical representation of the short-term graph deviates from the graph profile using a machine learning module.

In other features, the machine learning module comprises a plurality of machine learning models configured as a machine learning ensemble.

A system includes a computer. The computer includes a processor and a memory. The memory includes instructions such that the processor is programmed to: construct a plurality of short-term graphs based on data representing one or more primary events, at least one of create or update a long-term graph to include elements from the short-term graphs when a numerical representation of the short-term graphs deviates from a learnt graph profile, and determine whether a maliciousness probability of any subset of the long-term graph exceeds an anomaly threshold.

In other features, the processor is further programmed to generate an alert when the maliciousness probability exceeds the anomaly threshold.

In other features, the processor is further programmed to update the maliciousness probabilities of at least one subset of the long-term graph.

In other features, the data representing one or more primary events comprises parsed log data.

In other features, the processor is further programmed to receive log data from at least one endpoint monitoring agent.

In other features, the processor is further programmed to determine whether the numerical representation of the short-term graph deviates from the graph profile using a machine learning module.

In other features, the machine learning module comprises a plurality of machine learning models configured as a machine learning ensemble.

In other features, the processor is further programmed to determine the numerical representation of the short-term graphs based on at least one of a betweenness centrality, a closeness centrality, an Eigenvector centrality, an edge connectivity, a node connectivity, a number of communities, a community size distribution, Node2vec embeddings, Deepwalk embeddings, or Deep Neural Networks for Learning Graph Representations (DNGR) embeddings.

The present disclosure describes an anomaly detection system that provides cyber threat detection functionality. Typically, cyber threat detection systems incorporate signature-based analysis and/or rules-based analysis. As described herein, a system can extract primary events from multiple sources, such as log files, Next Generation Firewalls (NGFW), intrusion detection systems (IDSs), endpoint detection and response (EDR) systems, application programming interfaces (APIs), cloud computing devices, or the like. The system can perform signatureless correlations of the primary events to detect anomalous activity, such as malicious activity. For example, the system can use graph techniques to perform signatureless correlation of primary events that can result in actionable alerts. A graph can comprise a collection of nodes and edges in which the edges represent relationships between the nodes.

illustrates an example environment that includes a set of user devices (referred to collectively as “endpoints” and individually as “endpoint”), a set of server devices (referred to collectively as “server devices” and individually as “server device”), an anomaly detection manager, and a network. Devices of the environment may interconnect via wired connections, wireless connections, or a combination of wired and wireless connections.

The endpoint includes one or more devices capable of receiving, generating, storing, processing, and/or providing information associated with an account and/or a transaction for which the account is to be used. For example, the endpoint may include a desktop computer, a mobile phone, a laptop computer, a tablet computer, a handheld computer, a gaming device, a wearable communication device, e.g., a smart wristwatch, a pair of smart eyeglasses, etc., or a similar type of device.

The server device includes one or more devices capable of receiving, providing, storing, processing, and/or generating information associated with an account and/or a transaction for which the account is to be used. For example, the server device may include a server (e.g., in a data center or a cloud computing environment), a data center (e.g., a multi-server micro data center), a workstation computer, a virtual machine (VM) provided in a cloud computing environment, or a similar type of device. In some implementations, the server device may include a communication interface that allows the server device to receive information from and/or transmit information to other devices in the environment.

The anomaly detection manager includes a computing system of one or more devices capable of processing information from and/or transmitting information to the endpoints, as described in greater detail below. In an example implementation, as shown in, the server device includes the anomaly detection manager. In some examples, the server device may comprise a cloud server or a group of cloud servers. In some implementations, the anomaly detection manager may be designed to be modular, such that certain software components can be swapped in or out depending on a particular need.

In various implementations, the anomaly detection manager communicates with an endpoint monitoring agent residing on the endpoints. The endpoint monitoring agent comprises executable software that generates and/or monitors log data and/or files. The generated log data can include certain parameters or attributes associated with security and non-security related events and activities that occur within one or more communication networks, such as the network. As discussed in greater detail below, the log data and/or log files can be parsed into primary events that are used to generate graph elements. The log data and/or log files can comprise, but are not limited to, Domain Name System (DNS) traffic, cloud access security broker (CASB) data, Next Generation Firewall (NGFW) data, intrusion detection system (IDS) data, endpoint detection and response (EDR) system data, or the like.

The network includes one or more wired and/or wireless networks. For example, the network may include a cellular network (e.g., a long-term evolution (LTE) network, a code division multiple access (CDMA) network, a 3G network, a 4G network, a 5G network, another type of cellular network, etc.), a public land mobile network (PLMN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), a telephone network (e.g., the Public Switched Telephone Network (PSTN)), a private network, an ad hoc network, an intranet, the Internet, a fiber optic-based network, a cloud computing network, and/or the like, and/or a combination of these or other types of networks.

is a diagram of example components of a device. The device may correspond to the endpoint and/or the server device. In some implementations, the endpoint and/or the server device may include one or more devices and/or one or more components of the device. As shown in, the device may include a bus, a processor, a memory, a storage component, an input component, an output component, and a communication interface.

The bus includes a component that permits communication among the components of the device. The processor is implemented in hardware, firmware, or a combination of hardware and software. The processor is a central processing unit (CPU), a graphics processing unit (GPU), an accelerated processing unit (APU), a microprocessor, a microcontroller, a digital signal processor (DSP), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or another type of processing component. In some implementations, the processor includes one or more processors capable of being programmed to perform a function. The memory includes a random-access memory (RAM), a read only memory (ROM), and/or another type of dynamic or static storage device (e.g., a flash memory, a magnetic memory, and/or an optical memory) that stores information and/or instructions for use by the processor.

The storage component stores information and/or software related to the operation and use of the device. For example, the storage component may include a hard disk (e.g., a magnetic disk, an optical disk, a magneto-optic disk, and/or a solid-state disk), a compact disc (CD), a digital versatile disc (DVD), a floppy disk, a cartridge, a magnetic tape, and/or another type of non-transitory computer-readable medium, along with a corresponding drive.

The input component includes a component that permits the device to receive information, such as via user input (e.g., a touch screen display, a keyboard, a keypad, a mouse, a button, a switch, and/or a microphone). Additionally or alternatively, the input component may include a sensor for sensing information (e.g., a global positioning system (GPS) component, an accelerometer, a gyroscope, and/or an actuator). The output component includes a component that provides output information from the device (e.g., a display, a speaker, and/or one or more light-emitting diodes (LEDs)).

The communication interface includes a transceiver-like component (e.g., a transceiver and/or a separate receiver and transmitter) that enables the device to communicate with other devices, such as via a wired connection, a wireless connection, or a combination of wired and wireless connections. The communication interface may permit the device to receive information from another device and/or provide information to another device. For example, the communication interface may include an Ethernet interface, an optical interface, a coaxial interface, an infrared interface, a radio frequency (RF) interface, a universal serial bus (USB) interface, a Wi-Fi interface, a cellular network interface, or the like.

The device may perform one or more processes described herein. The device may perform these processes based on the processor executing software instructions stored by a non-transitory computer-readable medium, such as the memory and/or the storage component. A computer-readable medium is defined herein as a non-transitory memory device. A memory device includes memory space within a single physical storage device or memory space spread across multiple physical storage devices.

Software instructions may be read into the memory and/or the storage component from another computer-readable medium or from another device via the communication interface. When executed, software instructions stored in the memory and/or the storage component may cause the processor to perform one or more processes described herein. Additionally, or alternatively, hardwired circuitry may be used in place of or in combination with software instructions to perform one or more processes described herein. Thus, implementations described herein are not limited to any specific combination of hardware circuitry and software.

The number and arrangement of components shown in are provided as an example. In practice, the device may include additional components, fewer components, different components, or differently arranged components than those shown in. Additionally, or alternatively, a set of components (e.g., one or more components) of the device may perform one or more functions described as being performed by another set of components of the device.

illustrates example log data obtained from one or more endpoint monitoring agents, example primary events parsed from the log data, example graph elements generated based on corresponding primary events, and example short-term graphs generated based on the graph elements. In various implementations, the log data is received at the anomaly detection manager. The anomaly detection manager can store the received log data in a data structure, such as a NoSQL database. The anomaly detection manager parses the log data to generate the primary events using suitable parsing techniques. The parsed log data is used to define the primary events involving one or more entities. The anomaly detection manager can generate one or more graph elements based on one or more relationships, i.e., links, between entities, and the graph elements can be used to construct short-term graphs as shown in. The short-term graphs can further be stored within a data structure, such as a graph database.
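As a minimal sketch of the parsing step described above, the following Python turns raw log lines into primary events and then into graph elements. The source does not specify a log format, so the space-delimited layout (timestamp, user, device, URL), the edge labels "used" and "visited," and all function and field names here are hypothetical assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PrimaryEvent:
    timestamp: int  # seconds since epoch (assumed log format)
    user: str
    device: str
    url: str


# A graph element: (source node, edge label, target node).
GraphElement = Tuple[str, str, str]


def parse_log_line(line: str) -> PrimaryEvent:
    """Parse one hypothetical 'timestamp user device url' log line."""
    ts, user, device, url = line.split()
    return PrimaryEvent(int(ts), user, device, url)


def to_graph_elements(event: PrimaryEvent) -> List[GraphElement]:
    """Entities become nodes; the actions linking them become edges."""
    return [
        (event.user, "used", event.device),
        (event.device, "visited", event.url),
    ]


log_data = [
    "1700000000 user device_ID1 URL2",
    "1700000060 user device_ID1 URL3",
]
primary_events = [parse_log_line(line) for line in log_data]
graph_elements = [el for ev in primary_events for el in to_graph_elements(ev)]
```

Each primary event here yields two graph elements; a real deployment would presumably map each log source to its own entity and relationship types.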

As used herein, a “short-term” graph can be comprised of primary events collected over a predefined time period, e.g., thirty minutes, one hour, two hours, four hours, etc., and a “long-term” graph can be comprised of graph elements from “short-term” graphs.
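One way to picture the predefined time period is to bucket events by window index, so that each window produces its own short-term graph. The sketch below assumes events are already reduced to (timestamp, source, target) tuples and uses a one-hour window; the function name and the edge-set representation are illustrative assumptions, not details from the disclosure.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (source node, target node)


def build_short_term_graphs(
    events: List[Tuple[int, str, str]], window_seconds: int = 3600
) -> Dict[int, Set[Edge]]:
    """Group (timestamp, source, target) events into one edge set per window."""
    graphs: Dict[int, Set[Edge]] = defaultdict(set)
    for timestamp, source, target in events:
        window_index = timestamp // window_seconds
        graphs[window_index].add((source, target))
    return dict(graphs)


events = [
    (0, "user", "device_ID1"),
    (1800, "device_ID1", "URL2"),
    (4000, "user", "device_ID1"),  # falls into the second one-hour window
]
short_term_graphs = build_short_term_graphs(events, window_seconds=3600)
```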

Each graph can comprise multiple nodes that can be connected by an edge. Within the present disclosure, entities monitored within a communication network, such as the network, are represented as nodes, and events, e.g., actions, between entities can comprise edges.

As shown in, a short-term graph can be constructed from a collection of nodes connected by edges in which a particular node can have one or more relationships with other nodes. For example, a short-term graph can be generated based on a relationship between the node representing “user,” the node representing “device ID1,” and the node representing “URL2.” As shown, each pair of related nodes is connected by an edge.

illustrates an example environment for detecting anomalies. As shown, the anomaly detection manager clusters one or more short-term graphs according to the primary events extracted from the log data. Further, the anomaly detection manager can determine one or more numerical representations, i.e., vector representations, of the short-term graphs. In an example implementation, the anomaly detection manager can determine the numerical representations of the short-term graphs based on a betweenness centrality, a closeness centrality, an Eigenvector centrality, an edge connectivity, a node connectivity, a number of communities, a community size distribution, Node2vec embeddings, Deepwalk embeddings, Deep Neural Networks for Learning Graph Representations (DNGR) embeddings, or the like.
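To make the idea of a vector representation concrete, the sketch below computes a small feature vector from an undirected short-term graph using only the standard library. It covers just one of the listed measures, closeness centrality (computed within each node's connected component), plus node and edge counts; the connected-component count is used as a crude stand-in for "number of communities," which is an assumption rather than the disclosed technique. All names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets


def bfs_distances(graph: Graph, start: str) -> Dict[str, int]:
    """Shortest-path hop counts from start to every reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def closeness_centrality(graph: Graph, node: str) -> float:
    """Reachable node count divided by the sum of shortest-path distances."""
    distances = bfs_distances(graph, node)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total > 0 else 0.0


def count_components(graph: Graph) -> int:
    """Number of connected components (assumed proxy for communities)."""
    seen: Set[str] = set()
    components = 0
    for node in graph:
        if node not in seen:
            seen.update(bfs_distances(graph, node))
            components += 1
    return components


def numerical_representation(graph: Graph) -> List[float]:
    """Feature vector: [nodes, edges, mean closeness, components]."""
    n_nodes = len(graph)
    n_edges = sum(len(neighbors) for neighbors in graph.values()) // 2
    mean_closeness = (
        sum(closeness_centrality(graph, n) for n in graph) / n_nodes
        if n_nodes
        else 0.0
    )
    return [float(n_nodes), float(n_edges), mean_closeness,
            float(count_components(graph))]


graph = {
    "user": {"device_ID1"},
    "device_ID1": {"user", "URL2"},
    "URL2": {"device_ID1"},
}
vector = numerical_representation(graph)
```

Embedding-based representations such as Node2vec or Deepwalk would replace these hand-built features with learned vectors; they are omitted here because they need a training step and third-party libraries.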

The numerical representations are provided to a machine learning module that identifies outliers, i.e., anomalous network behavior. Within the present context, the short-term graphs corresponding to the numerical representations identified as deviating from a predetermined graph profile can comprise secondary signals indicative of anomalous network behavior. In an example implementation, the machine learning module comprises a machine learning ensemble that includes multiple machine learning models. While the machine learning ensemble is illustrated as including these machine learning models, it is understood that the machine learning ensemble can include additional or fewer machine learning models.

The machine learning module determines whether a particular numerical representation of a short-term graph deviates with respect to one or more predefined graph profiles, i.e., a learnt graph profile. For example, the machine learning module uses the machine learning ensemble to determine whether the deviation between the numerical representations and a predetermined graph profile is greater than a predetermined deviation threshold. The machine learning ensemble may determine the deviation using machine learning models, which can include, but are not limited to, decision trees, support vector machines, Boltzmann machines, restricted Boltzmann machines, autoencoders, isolation forests, deep support vector data descriptions, and/or clustering algorithms.
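The threshold comparison can be illustrated with a deliberately simple statistical stand-in for the listed models. In the sketch below, the "learnt graph profile" is assumed to be the per-feature mean and standard deviation of vectors from known non-malicious graphs, and the deviation is assumed to be the Euclidean norm of the per-feature z-scores. Neither choice comes from the disclosure, and the names and the default threshold of 3.0 are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

# Assumed profile: per-feature (means, standard deviations).
Profile = Tuple[List[float], List[float]]


def learn_profile(benign_vectors: Sequence[Sequence[float]]) -> Profile:
    """Summarize benign feature vectors as per-feature mean and spread."""
    columns = list(zip(*benign_vectors))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for zero spread so the later division is defined.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def deviation(vector: Sequence[float], profile: Profile) -> float:
    """Euclidean norm of per-feature z-scores (an assumed deviation measure)."""
    means, stds = profile
    return math.sqrt(
        sum(((x - m) / s) ** 2 for x, m, s in zip(vector, means, stds))
    )


def deviates(vector: Sequence[float], profile: Profile,
             threshold: float = 3.0) -> bool:
    """True when the deviation exceeds the predetermined threshold."""
    return deviation(vector, profile) > threshold


benign = [[3.0, 2.0], [4.0, 3.0], [5.0, 4.0]]
profile = learn_profile(benign)
```

A vector equal to the profile mean has zero deviation, while a graph ten times larger than any benign example falls far outside the threshold.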

The machine learning ensemble may be a repository of machine learning engines, which can comprise a hybrid engine, a homogeneous engine, or a heterogeneous engine. The machine learning ensemble is homogeneous where the individual machine learning models that make up the ensemble are of the same type. The machine learning ensemble is heterogeneous where the individual machine learning models that make up the ensemble are of different types.

As shown, the machine learning ensemble also includes a majority voting engine that determines whether the short-term graph deviates from the one or more predetermined graph profiles based on the output of the machine learning models. The one or more predetermined graph profiles may be generated based on defined non-malicious activities within a communication network.
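The majority voting step can be sketched as follows, assuming each model in the ensemble returns a boolean "deviates" decision for a numerical representation. The three threshold rules are hypothetical stand-ins for trained models, and requiring a strict majority (more than half) is an assumption about how ties would be handled.

```python
from typing import Callable, List, Sequence

# Each model maps a numerical representation to True (deviates) or False.
Model = Callable[[Sequence[float]], bool]


def majority_vote(models: List[Model], vector: Sequence[float]) -> bool:
    """Return True when more than half of the models flag the vector."""
    votes = sum(1 for model in models if model(vector))
    return votes * 2 > len(models)


# Three hypothetical stand-in models over a [nodes, edges] vector.
models: List[Model] = [
    lambda v: v[0] > 10.0,        # unusually many nodes
    lambda v: v[1] > 20.0,        # unusually many edges
    lambda v: v[1] > 3.0 * v[0],  # unusually dense
]
```

With these rules, a vector such as [12.0, 25.0] is flagged by two of the three models and so counts as deviating, while [12.0, 15.0] is flagged by only one and does not.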

The anomaly detection manager can calculate a probability of the short-term graph being malicious. In an example implementation, the anomaly detection manager can calculate the probability of the short-term graph being malicious based on the amount of deviation of the short-term graph with respect to the one or more graph profiles.

illustrates an event in which the long-term graph and the short-term graph have some elements in common. For example, the anomaly detection manager can combine elements of the short-term graph with a long-term graph. As shown, nodes “D,” “E,” and “G” are common to both graphs, i.e., they overlap. Referring to, based on the overlap, the anomaly detection manager combines elements of the short-term graph accordingly to form a merged long-term graph. The anomaly detection manager can then calculate a probability of the combined elements of the long-term graph as being malicious. In an example implementation, the anomaly detection manager can calculate the maliciousness probability according to Equation 1:

P_new = (P_cur·N_cur + p_i·n_i + 0.5·O·(P_cur + p_i)) / (N_cur + n_i + O)   (Equation 1)

where P_new is the updated maliciousness probability, P_cur is the probability that the current cluster is malicious, N_cur is the number of nodes in the current cluster, p_i is the probability that the new cluster is malicious, n_i is the number of nodes in the new cluster, and O is the number of nodes common to both the current and new clusters. The updated probability associated with modified elements, or a cluster, is stored in a data structure. As discussed herein, the anomaly detection manager can generate an alert when the updated maliciousness probability exceeds a predefined anomaly threshold.
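The update formula recited in the claims, P_new = (P_cur·N_cur + p_i·n_i + 0.5·O·(P_cur + p_i)) / (N_cur + n_i + O), can be evaluated directly. The sketch below is a straightforward transcription; the function name and the input values are hypothetical, with the overlap of 3 chosen to echo the three shared nodes “D,” “E,” and “G.”

```python
def updated_maliciousness(
    p_cur: float, n_cur: int, p_i: float, n_i: int, overlap: int
) -> float:
    """Weighted update of a cluster's maliciousness probability.

    Overlapping nodes add extra weight at the average of the two
    cluster probabilities, per the formula recited in the claims.
    """
    numerator = p_cur * n_cur + p_i * n_i + 0.5 * overlap * (p_cur + p_i)
    denominator = n_cur + n_i + overlap
    if denominator == 0:
        raise ValueError("clusters must contain at least one node")
    return numerator / denominator


# Hypothetical values: a 7-node current cluster at 0.2, a 5-node new
# cluster at 0.8, and 3 nodes in common.
p_new = updated_maliciousness(p_cur=0.2, n_cur=7, p_i=0.8, n_i=5, overlap=3)
```

For these inputs the numerator is 1.4 + 4.0 + 1.5 = 6.9 and the denominator is 15, giving an updated probability of 0.46. Because the result is a weighted average of P_cur, p_i, and their mean, it always lies between the two input probabilities.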

If there are no common nodes, elements of the short-term graph can be added directly to the long-term graph. In these instances, the probability associated with added elements, or cluster, is stored in a data structure.
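The two cases, overlap and no overlap, can be sketched together by modeling the long-term graph as a list of clusters, each holding its edges and a maliciousness probability. This representation, the class and function names, and the simplification of merging only with the first overlapping cluster are assumptions for illustration; the probability update follows the formula recited in the claims.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple

Edge = Tuple[str, str]


@dataclass
class Cluster:
    edges: Set[Edge]
    probability: float  # maliciousness probability

    @property
    def nodes(self) -> Set[str]:
        return {node for edge in self.edges for node in edge}


def merge(current: Cluster, new: Cluster) -> Cluster:
    """Combine two overlapping clusters and update the probability."""
    overlap = len(current.nodes & new.nodes)
    n_cur, n_i = len(current.nodes), len(new.nodes)
    p_new = (
        current.probability * n_cur
        + new.probability * n_i
        + 0.5 * overlap * (current.probability + new.probability)
    ) / (n_cur + n_i + overlap)
    return Cluster(current.edges | new.edges, p_new)


def update_long_term(clusters: List[Cluster], short_term: Cluster) -> List[Cluster]:
    """Merge into the first overlapping cluster, else add as a new cluster."""
    for index, cluster in enumerate(clusters):
        if cluster.nodes & short_term.nodes:
            merged = merge(cluster, short_term)
            return clusters[:index] + [merged] + clusters[index + 1:]
    # No common nodes: store the elements with their own probability.
    return clusters + [short_term]


def alerts(clusters: List[Cluster], anomaly_threshold: float) -> List[Cluster]:
    """Clusters whose maliciousness probability exceeds the threshold."""
    return [c for c in clusters if c.probability > anomaly_threshold]


long_term = [Cluster({("A", "D"), ("D", "E"), ("E", "G")}, 0.4)]
overlapping = Cluster({("D", "E"), ("E", "G"), ("G", "H")}, 0.9)
disjoint = Cluster({("X", "Y")}, 0.1)

updated = update_long_term(long_term, overlapping)  # shares D, E, G
updated = update_long_term(updated, disjoint)       # no common nodes
flagged = alerts(updated, anomaly_threshold=0.6)
```

In this example the overlapping graph raises the first cluster's probability from 0.4 to 0.65, which exceeds the hypothetical threshold of 0.6, while the disjoint graph is stored as a separate cluster at its own probability of 0.1.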

Patent Metadata

Filing Date

Unknown

Publication Date

April 14, 2026

Inventors

Unknown
