A computer-implemented method for training a neural network, wherein the neural network is an invertible neural network configured for accepting a sensor signal as input. The method includes: obtaining a first sensor signal from a dataset; sampling a value from a latent space of the neural network, wherein the value is sampled from a predefined hypervolume in the latent space; determining a second sensor signal from the sampled value by inversely mapping the sampled value through the neural network; determining a first latent representation by forward mapping the first sensor signal through the neural network; determining a second latent representation by forward mapping the second sensor signal through the neural network; determining a loss value from a loss function; and training the neural network based on a negative gradient of the loss value.
Legal claims defining the scope of protection, as filed with the USPTO.
. A computer-implemented method for training a neural network, wherein the neural network is an invertible neural network configured for accepting a sensor signal as input, the method comprising the following steps:
. The method according to, wherein the center of hypervolume is located at an origin of the latent space.
. The method according to, wherein the negative gradient is determined based on an automatic differentiation method and wherein the second sensor signal is further detached from the computational graph of the automatic differentiation method before being forward mapped through the neural network.
. The method according to, wherein the hypervolume is a hypersphere or a hypercube or a hyperrectangle.
. The method according to, wherein the first term and/or the second term is: a Euclidean distance or a cosine similarity or a function expressing a maximum of a dimension-wise distance or a sum of fourth powers of a dimension-wise distance.
. A computer-implemented method for determining whether a provided sensor signal is normal or anomalous, the method comprising:
. The method according to, wherein the threshold characterizes a border of the hypervolume.
. The method according to, wherein the provided sensor signal is of a sensor observing an environment of a technical system machine and/or observing the technical system itself, the method further comprising:
. A training system configured to train a neural network, wherein the neural network is an invertible neural network configured for accepting a sensor signal as input, the training system configured to:
. A control system configured to determine a control signal for a technical system, the control system configured to:
. A non-transitory machine-readable storage medium on which is stored a computer program for training a neural network, wherein the neural network is an invertible neural network configured for accepting a sensor signal as input, the computer program, when executed by a processor, causing the processor to perform the following steps:
Complete technical specification and implementation details from the patent document.
The present application claims the benefit under 35 U.S.C. § 119 of European Patent Application No. EP 24 17 7124.5 filed on May 21, 2024, which is expressly incorporated herein by reference in its entirety.
The present invention concerns a computer-implemented method for training a neural network, a computer-implemented method for determining whether a supplied sensor signal is anomalous or not, a training system, a control system, a computer program, and a machine-readable storage medium.
European Patent No. EP 3 975 011 describes a computer-implemented method for training a self-normalizing flow.
Teshima et al. “Coupling-based Invertible Neural Networks are Universal Diffeomorphism Approximators”, 2020, arxiv.org/pdf/2006.11469 describes that a CF-INN is universal if its layers contain affine coupling and invertible linear functions as special cases.
Many modern systems rely on perceiving their environment or their own functionality through suitable sensors. The received sensor signals are often processed by statistical methods to extract useful information. Machine learning systems in particular are frequently employed for processing sensor signals, e.g., for classifying the content of such sensor signals or for performing a regression analysis based on the sensor signals. The results of the machine learning systems are typically a vital building block for automated systems. For example, in robotics, the environment as perceived through a machine learning system may serve as a proxy for planning actions in the real world. Likewise, the operation of a machine may depend on a self-perception of the machine. For example, an automated device may receive sensor signals about its own temperature, pressure, certain speeds of the device or elements of the device (e.g., wheels or bearings), noises originating from the device or vibrations originating from the device. The operation of the device may then be dependent on a classification of the received sensor signals, e.g., whether the sensor signals characterize a normal operation.
Whether perceiving an environment or performing self-assessment as described above, it is paramount for sensor signals processed by automated devices to be analyzed with respect to whether a respective sensor signal represents an outlier, i.e., a sensor measurement that is not typically measured during operation of the sensor. Such signals may hint at failures of the sensing system, or they may be sensor signals for which a machine learning system processing the sensor signals has not been trained during its training phase. Typically, such sensor signals lead to unpredictable and/or diminished performance of the respective machine learning system.
Advantageously, the method for training according to the present invention allows for training a neural network to reliably detect whether a supplied sensor signal characterizes an outlier with respect to a known dataset of sensor signals. The method for training of the present invention advantageously results in a neural network that is configured for anomaly detection. This neural network may then be used in any of the above-described cases in order to detect anomalies at inference time, allowing for suitable measures to be taken in case an anomaly is detected (e.g., hailing a human operator).
In a first aspect, the present invention concerns a computer-implemented method for training a neural network, wherein the neural network is an invertible neural network configured for accepting a sensor signal as input. According to an example embodiment of the present invention, the method includes the following steps:
By construction, the neural network is configured for anomaly detection after training as the training method can be understood to train the neural network for one-class classification: Latent representations of samples (i.e., sensor signals) from the dataset are pulled close to the center of the hypervolume while other samples (i.e., latent representations of sensor signals not in the dataset) are pushed outside of the hypervolume. This way, the neural network learns to map samples from the dataset into the hypervolume while other samples are mapped to the outside of the hypervolume.
This may also be understood as samples from the dataset constituting the one class of a one-class classification problem. This one class is mapped into the hypervolume (e.g., a hypersphere or a hypercube) and all other classes are mapped to the outside.
“Sampling a value” may preferably be achieved by obtaining a random sample from the hypervolume. However, deterministic sampling strategies are also possible (e.g., selecting certain landmark points in the hypervolume or selecting points from a previously defined list of points). The term “value” may especially be understood as a non-scalar representation of data, e.g., a vector, a matrix, or a tensor. The “value” may hence be understood as having a dimensionality of the latent space.
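The random sampling described above can be sketched as follows. This is a minimal illustration only: the function names, the uniform distribution, and the example dimensions are assumptions made here and are not taken from the specification.

```python
import math
import random


def sample_hypercube(center, half_width, rng):
    """Draw one point uniformly from an axis-aligned hypercube (assumed shape)."""
    return [c + rng.uniform(-half_width, half_width) for c in center]


def sample_hypersphere(center, radius, rng):
    """Draw one point uniformly from a solid hypersphere (assumed shape)."""
    dim = len(center)
    # A normalized Gaussian vector is uniformly distributed over directions.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(d * d for d in direction))
    # Scaling by u**(1/dim) makes the sample uniform in volume, not in radius.
    scale = radius * rng.random() ** (1.0 / dim) / norm
    return [c + scale * d for c, d in zip(center, direction)]


rng = random.Random(0)
center = [0.0, 0.0, 0.0]  # hypothetical 3-dimensional latent space
cube_point = sample_hypercube(center, 1.0, rng)
ball_point = sample_hypersphere(center, 1.0, rng)
```

A deterministic strategy, such as iterating over a predefined list of landmark points, could replace either function without changing the rest of the method.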
The latent space may be understood as the space of output of the neural network. In other words, the neural network produces outputs of a predefined mathematical structure, e.g., a vector, a matrix, or a tensor. This output “lives” in a certain space of dimensionality equal to the dimensionality of the output of the neural network.
Alternatively, the latent space may be understood as the output of an intermediate layer of the neural network (i.e., a hidden layer of the neural network). As the neural network is invertible, any intermediate representation is also invertible making such intermediate representations also suitable to be used in the method.
The term “invertible neural networks” is understood to comprise both neural networks that provide for an approximate bijective mapping (e.g., approximated normalizing flows such as self-normalizing flows) as well as neural networks that provide for an exact bijective mapping (e.g., normalizing flows, invertible residual networks).
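As an illustration of an exact bijective mapping, the following sketch implements a single affine coupling layer with a forward and an inverse mapping. The scale and shift functions are fixed, hypothetical stand-ins for the learned sub-networks of a real coupling layer, and all names are chosen here for illustration.

```python
import math


def _scale_and_shift(x_a):
    # Hypothetical stand-in for the learned sub-networks of a coupling layer.
    s = [math.tanh(0.5 * v) for v in x_a]
    t = [0.1 * v + 0.2 for v in x_a]
    return s, t


def coupling_forward(x):
    """Forward mapping: the first half passes through, the second half is transformed."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    s, t = _scale_and_shift(x_a)
    z_b = [b * math.exp(si) + ti for b, si, ti in zip(x_b, s, t)]
    return x_a + z_b


def coupling_inverse(z):
    """Inverse mapping: recompute scale and shift from the untouched half and undo them."""
    half = len(z) // 2
    z_a, z_b = z[:half], z[half:]
    s, t = _scale_and_shift(z_a)
    x_b = [(b - ti) * math.exp(-si) for b, si, ti in zip(z_b, s, t)]
    return z_a + x_b


x = [0.5, -1.0, 2.0, 0.25]     # toy even-length input
z = coupling_forward(x)        # latent representation
x_rec = coupling_inverse(z)    # reconstructs x up to floating-point error
```

Because the first half of the input is left unchanged, the inverse can always recompute the same scale and shift values, which is what makes the layer invertible regardless of how complex the sub-networks are.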
In all example embodiments of the method of the present invention, the hypervolume may be of an arbitrary shape, but preferably has an at least weakly monotonic (with respect to the center) and compact distribution in the hyperspace. Preferred embodiments of the hypervolume are a hypersphere, a hypercube, or a hyperrectangle.
The term “difference” may be understood as a distance or any other suitable measure of similarity, e.g., a cosine similarity.
In all example embodiments of the method of the present invention, the center may be a center of mass of the hypervolume, a geometric center, or any other point in the hypervolume that represents an origin of the hypervolume.
The term “negative difference” may be understood as multiplying an obtained distance or similarity with −1 in order to obtain a value indicating the distance or similarity in a negative direction.
“Training the neural network based on a negative gradient” may be understood as using the negative gradient in a gradient descent method for training the neural network. Common methods such as Adam or SGD may be used.
In all example embodiments of the method of the present invention, the term sensor signal may be understood as a digital measurement result of a sensor. The sensor may perceive an environment of a technical system and/or the technical system itself (e.g., certain physical properties of the technical system such as temperature, pressure, speed, acceleration, rotation, vibrations, power consumption, electrical current, amperage, or the like). The sensor may especially be an optical sensor with the sensor signal characterizing a measurement of the optical sensor, i.e., an image. The optical sensor may especially be a camera, a lidar, a radar, an ultrasonic sensor, or a thermal camera. The neural network using the sensor signal as input may especially be understood as the neural network operating on the low-level features of the sensor signal as input. For images, for example, the neural network may especially receive the pixel values as input and perform its computations based on these pixel values. For, e.g., audio data, the sensor signal may represent a spectrum of an audio signal or quantized samples of the audio signal. The neural network may hence operate on these values respectively.
The sensor signal may also be a time series of individual sensor readings from a predefined sensor. The sensor in this case may especially be a sensor for measuring temperature, pressure, speed, acceleration, rotation, vibrations, power consumption, electrical current, amperage, or the like.
It is understood that the term “sensor signal” also covers representations of a raw sensor signal. For example, if a raw sensor signal is provided to a pre-processing function and the neural network is provided the pre-processed result of the sensor signal, this is still understood as “the neural network receiving the sensor signal as input”.
Advantageously, according to an example embodiment of the present invention, the neural network is configured during training to be able to perform one-class classification. The one class is represented by sensor signals from the training dataset. During inference, sensor signals can advantageously be flagged as anomalous if they fall outside of the hypervolume and/or on the surface of the hypervolume.
Even more advantageously, the inventors found that the training method improves an anomaly detection performance of the neural network. In particular, the training method leads to an increased anomaly detection performance of a coupling-based invertible neural network evaluated on the CIFAR10 dataset compared to a standard coupling-based invertible neural network. The increased performance is especially present for the coupling-based invertible neural network as described by Teshima et al.
The center of the hypervolume may preferably be located at the origin of the latent space.
Advantageously, this saves the method of the present invention from performing computations as the respective differences computed in the method default to the first latent representation and second latent representation respectively. This leads to a reduction of necessary computational resources.
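As a worked illustration of this simplification, assume the Euclidean distance as the measure of difference and a loss that is the sum of the two terms. The symbols z_1 and z_2 for the first and second latent representation and c for the center are chosen here for readability and are assumptions of this sketch.

```latex
\begin{aligned}
\mathcal{L} &= \lVert z_1 - c \rVert_2 \;-\; \lVert z_2 - c \rVert_2 \\
            &= \lVert z_1 \rVert_2 \;-\; \lVert z_2 \rVert_2
            \qquad \text{for } c = 0 .
\end{aligned}
```

With the center at the origin, the subtraction of c disappears from both terms, so each difference is computed directly from the respective latent representation.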
In preferred embodiments of the present invention, the negative gradient is determined based on an automatic differentiation method, wherein the second sensor signal is further detached from the computational graph of the automatic differentiation method before being forward mapped through the neural network.
Advantageously, detaching the second sensor signal from the computational graph further increases performance. Without detaching, the gradients of the sampled value that is to be pushed out of the hypervolume would cancel out in the training process, and anomalous samples would remain in the hypervolume.
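The cancellation can be made explicit for the case of an exactly invertible network. In the following sketch, f_θ denotes the forward mapping with parameters θ, z̃ the sampled value, and x̃ the second sensor signal. This notation is introduced here for illustration.

```latex
\begin{aligned}
f_\theta\bigl(f_\theta^{-1}(\tilde{z})\bigr) = \tilde{z} \ \text{ for all } \theta
  \quad &\Longrightarrow \quad
  \frac{d}{d\theta}\, f_\theta\bigl(f_\theta^{-1}(\tilde{z})\bigr) = 0 , \\
\tilde{x} = \operatorname{detach}\bigl(f_\theta^{-1}(\tilde{z})\bigr)
  \quad &\Longrightarrow \quad
  \frac{d}{d\theta}\, f_\theta(\tilde{x})
  = \frac{\partial f_\theta}{\partial \theta}(\tilde{x}) .
\end{aligned}
```

Without detaching, the round trip through the inverse and forward mapping is the identity for every choice of parameters, so the second latent representation carries no gradient. With detaching, x̃ is treated as a constant input and the partial derivative with respect to the parameters is in general nonzero.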
In preferred embodiments of the present invention, the first term and/or the second term is a Euclidean distance or a cosine similarity or a function expressing a maximum of a dimension-wise distance or a sum of the fourth powers of a dimension-wise distance.
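The four measures named above can be written out as follows. This is a minimal sketch with function names chosen here for illustration. It takes the dimension-wise distance to be the absolute difference per dimension, which is an assumption.

```python
import math


def euclidean_distance(z, c):
    return math.sqrt(sum((zi - ci) ** 2 for zi, ci in zip(z, c)))


def cosine_similarity(z, c):
    # Undefined if either vector is all zeros, e.g., for a center at the origin.
    dot = sum(zi * ci for zi, ci in zip(z, c))
    norm_z = math.sqrt(sum(zi * zi for zi in z))
    norm_c = math.sqrt(sum(ci * ci for ci in c))
    return dot / (norm_z * norm_c)


def max_dimensionwise_distance(z, c):
    # Assumes |z_i - c_i| as the dimension-wise distance.
    return max(abs(zi - ci) for zi, ci in zip(z, c))


def sum_of_fourth_powers(z, c):
    return sum((zi - ci) ** 4 for zi, ci in zip(z, c))


z = [3.0, -4.0]  # toy latent representation
c = [0.0, 0.0]   # center at the origin
print(euclidean_distance(z, c))          # 5.0
print(max_dimensionwise_distance(z, c))  # 4.0
print(sum_of_fourth_powers(z, c))        # 337.0
print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))
```

The set of points whose Euclidean distance to the center is at most r is a hypersphere of radius r, and the set of points whose maximum dimension-wise distance is at most r is a hypercube of half-width r, so these two measures pair naturally with the corresponding hypervolume shapes.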
The inventors found these measures of difference to lead to the biggest performance increase of the method.
In another aspect, the present invention concerns a computer-implemented method for determining whether a provided sensor signal is normal or anomalous.
According to an example embodiment of the present invention, the method includes:
This method can be understood as the inference method corresponding to the training method of the present invention as presented above. The inference method of the present invention inherits all its advantages from the training method.
Obtaining the neural network may be understood as performing the steps of the training method as part of the method for determining whether a provided sensor signal is normal or anomalous. Alternatively, it may also be understood as obtaining a neural network that has been trained with a training method as presented above, e.g., by downloading it from the internet or receiving it through online training platforms.
Preferably, the threshold in the inference method characterizes a border of the hypervolume of the training method. In other words, the edge of the previously used hypervolume is used for deciding whether a provided sensor signal is normal (inside the hypervolume) or anomalous (outside the hypervolume). Whether the edge itself is considered normal or anomalous depends on the user's choice (i.e., it is a hyperparameter of the method).
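The decision rule can be sketched as follows, assuming a hypercube as the hypervolume so that the threshold is its half-width. The function name, the parameter names, and the example values are hypothetical. In practice the latent representation would be obtained by forward mapping the provided sensor signal through the trained neural network.

```python
def is_anomalous(latent, center, threshold, boundary_is_anomalous=False):
    """Flag a latent representation lying outside a hypercube of half-width `threshold`."""
    # Maximum dimension-wise distance to the center (assumed measure of difference).
    distance = max(abs(z - c) for z, c in zip(latent, center))
    if distance == threshold:
        # Treatment of the border is left to the user as a hyperparameter.
        return boundary_is_anomalous
    return distance > threshold


center = [0.0, 0.0]
print(is_anomalous([0.2, -0.7], center, 1.0))  # False
print(is_anomalous([0.2, 1.5], center, 1.0))   # True
print(is_anomalous([1.0, 0.0], center, 1.0, boundary_is_anomalous=True))  # True
```

For a hypersphere, the same structure would apply with the Euclidean distance and the radius as the threshold.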
In another aspect, the present invention concerns a method for determining a control signal for a machine, wherein a sensor signal of a sensor observing an environment of the machine and/or the machine itself is provided to a method for determining whether the sensor signal is anomalous or not in accordance with the present invention as presented above, wherein the control signal is determined according to the determination of the sensor signal as anomalous or normal.
Advantageously, this allows for an increase in controllability of the machine, as anomalous sensor signals responsible for a behavior of the machine are detected more reliably. This increases the safety of the machine.
Example embodiments of the present invention will be discussed with reference to the figures in more detail.
shows an embodiment of a hypervolume (h) in a latent space (s). In the embodiment, the hypervolume (h) is a hypercube, preferably located at an origin of the latent space (s). When running the training method as described above, the neural network determines first latent representations (z) and second latent representations (z). The first latent representations (z) correspond to sensor signals from a training dataset while the second latent representations (z) correspond to other sensor signals, i.e., sensor signals not in the training dataset. During training, the first latent representations (z) are pushed towards a center (c) of the hypervolume, while the second latent representations (z) are being pushed away from the center.
depicts a neural network () during training of the neural network (). The neural network () is an invertible neural network, e.g., an invertible residual neural network.
The neural network () is provided a sensor signal (x) for which the neural network () determines a first latent representation (z) in a forward mapping operation.
A value (z̃) from within the hypervolume (h) of the latent space is determined. Preferably, the value (z̃) is sampled at random. The value (z̃) is then propagated through the neural network () in a backward mapping operation, resulting in a second sensor signal (x̃). When using automatic differentiation methods (e.g., autodiff) for training the neural network (), the resulting second sensor signal (x̃) is preferably detached from the computational graph in a detaching operation (). The result of detaching is the same second sensor signal (x̃); however, the second sensor signal (x̃) is detached from gradient computation and hence has no negative impact on the training method.
The second sensor signal (x̃) is then propagated through the neural network () in a forward mapping operation in order to determine a second latent representation (z). The second latent representation () and the first latent representation (z) are then forwarded to a loss function () in order to determine a loss value (l). The loss function () comprises a first term that characterizes a difference of the first latent representation (z) and the center (c) of the hypervolume (h) and a second term that characterizes a negative difference of the second latent representation (z) and the center (c).
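One training step of the kind described above can be sketched with a toy invertible network. Everything specific in this sketch is an assumption made for illustration: the network is a per-dimension affine map z = exp(s) * x + t, the difference is the squared Euclidean distance, the gradients are derived by hand instead of by automatic differentiation, and all names and numbers are hypothetical.

```python
import math
import random


def forward(x, s, t):
    """Forward mapping of the toy network: z = exp(s) * x + t, per dimension."""
    return [math.exp(si) * xi + ti for xi, si, ti in zip(x, s, t)]


def inverse(z, s, t):
    """Inverse mapping: x = (z - t) * exp(-s), per dimension."""
    return [(zi - ti) * math.exp(-si) for zi, si, ti in zip(z, s, t)]


def loss_and_grads(x, x_tilde, s, t, center):
    """Return the loss and its hand-derived gradients with respect to s and t.

    x_tilde enters as a constant, which mirrors detaching it from the graph.
    """
    z1 = forward(x, s, t)        # first latent representation
    z2 = forward(x_tilde, s, t)  # second latent representation
    d1 = [a - c for a, c in zip(z1, center)]
    d2 = [b - c for b, c in zip(z2, center)]
    # First term pulls z1 toward the center; the negative second term pushes z2 away.
    loss = sum(v * v for v in d1) - sum(v * v for v in d2)
    grad_t = [2.0 * a - 2.0 * b for a, b in zip(d1, d2)]
    grad_s = [
        2.0 * a * math.exp(si) * xi - 2.0 * b * math.exp(si) * xt
        for a, b, si, xi, xt in zip(d1, d2, s, x, x_tilde)
    ]
    return loss, grad_s, grad_t


rng = random.Random(0)
center = [0.0, 0.0]
s, t = [0.0, 0.0], [0.5, -0.5]  # toy parameters
x = [2.0, -1.0]                 # stand-in for a first sensor signal from the dataset

z_tilde = [c + rng.uniform(-1.0, 1.0) for c in center]  # sampled inside a unit hypercube
x_tilde = inverse(z_tilde, s, t)                        # second sensor signal
z2_initial = forward(x_tilde, s, t)                     # maps back onto z_tilde

loss_before, grad_s, grad_t = loss_and_grads(x, x_tilde, s, t, center)
learning_rate = 0.01
s = [si - learning_rate * g for si, g in zip(s, grad_s)]  # step along the negative gradient
t = [ti - learning_rate * g for ti, g in zip(t, grad_t)]
loss_after, _, _ = loss_and_grads(x, x_tilde, s, t, center)
```

Because `x_tilde` is a plain list of floats, it carries no dependence on the parameters, which plays the role of the detaching operation. With an automatic differentiation framework, the same effect would be obtained by explicitly detaching the inverse-mapped signal before the second forward pass.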
shows an embodiment of a training system () for training the neural network () of the control system () by means of a training data set (T). The training data set (T) comprises a plurality of sensor signals (x) which are used for training the neural network (), wherein the training data set (T) further comprises, for each input signal (x), a desired output signal (t) which corresponds to the input signal (x) and characterizes a classification of the input signal (x).
For training, a training data unit () accesses a computer-implemented database (St), the database (St) providing the training data set (T). The training data unit () selects, preferably at random, at least one sensor signal (x) from the training data set (T) and transmits the sensor signal (x) to the neural network (). In a combined step (), the neural network () determines a first latent representation (z) and a second latent representation (z) based on the sensor signal (x), wherein a loss value (l) is determined from the first latent representation (z) and second latent representation (z).
The loss value (l) is transmitted to a modification unit ().
The modification unit () determines new parameters (Φ′) of the neural network () based on the loss value. In the given embodiment, this is done using a gradient descent method, preferably stochastic gradient descent, Adam, or AdamW. In further embodiments, training may also be based on an evolutionary algorithm or a second-order method for training neural networks.
In other preferred embodiments, the described training is repeated iteratively for a predefined number of iteration steps or repeated iteratively until the first loss value falls below a predefined threshold value. Alternatively, or additionally, it is also possible that the training is terminated when an average first loss value with respect to a test or validation data set falls below a predefined threshold value. In at least one of the iterations the new parameters (Φ′) determined in a previous iteration are used as parameters (Φ) of the neural network ().
Furthermore, the training system () may comprise at least one processor () and at least one machine-readable storage medium () containing instructions which, when executed by the processor (), cause the training system () to execute a training method according to one of the aspects of the present invention.
shows an embodiment of a control system () determining a control signal (A) for an actuator () or a display () based on an output of the trained neural network (). The actuator () and its environment () will be jointly called actuator system. At preferably evenly spaced points in time, a sensor () senses a condition of the actuator system. The sensor () may comprise several sensors. Preferably, the sensor () is an optical sensor that takes images of the environment (). An output signal (S) of the sensor () (or, in case the sensor () comprises a plurality of sensors, an output signal (S) for each of the sensors) which encodes the sensed condition is transmitted to the control system ().
November 27, 2025