10558430

Neural network engine

Published: February 11, 2020
Assignee: not available in USPTO data we have
Patent Claims
17 claims

Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.

Claim 1

Original Legal Text

1. A neural network engine configured to receive at least one M×N window of floating point number values corresponding to a pixel of an input map and a corresponding set of M×N floating point number kernel values for a neural network layer of a neural network, the neural network engine comprising: a plurality of M×N floating point multipliers, each floating point multiplier having a first operand input configured to be connected to an input map value and a second operand input configured to be connected to a corresponding kernel value; pairs of multipliers within said M×N floating point multipliers providing respective floating point number outputs to respective input nodes of a tree of nodes, each node of said tree being configured to provide a floating point number output corresponding to either: a larger of inputs of said node; or a sum of said inputs, one output node of said tree providing a first input of an output logic, and an output of one of said M×N floating point multipliers providing a second input of said output logic; wherein when said engine is configured to process a convolution layer of the neural network, each of said kernel values comprises a trained value for said layer, said nodes of the tree are configured to sum their inputs and said output logic is configured to sum its first and second inputs, to apply an activation function to said sum and to provide an output of said activation function as an output of said output logic; wherein when said engine is configured to process an average pooling layer of the neural network, each of said kernel values comprises a value corresponding to 1/(M×N), said nodes of the tree are configured to sum their inputs and said output logic is configured to sum its first and second inputs and to provide said sum as said output of said output logic; and wherein when said engine is configured to process a max pooling layer of the neural network, each of said kernel values comprises a value equal to 1, said nodes of the tree are configured to output a larger of their inputs and said output logic is configured to output a larger of its first and second inputs as said output of said output logic.

Plain English Translation

This invention relates to a neural network engine designed to efficiently process convolution, average pooling, and max pooling operations in neural networks. The engine receives an M×N window of floating-point input values from an input map and a corresponding M×N set of kernel values for a neural network layer. The engine includes an array of M×N floating-point multipliers, each multiplying an input map value with a kernel value. The outputs of these multipliers feed into a tree of nodes, where each node either sums its inputs or selects the larger of its inputs, depending on the operation being performed. The final output is generated by an output logic that processes the tree's output alongside one of the multiplier outputs. For convolution layers, the kernel values are trained weights, and the tree and output logic perform summation followed by an activation function. For average pooling, the kernel values are set to 1/(M×N), and the tree and output logic compute the average of the inputs. For max pooling, the kernel values are set to 1, and the tree and output logic select the maximum value. This design enables flexible and efficient execution of different neural network operations using a shared hardware structure.
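The data path described in claim 1 can be sketched in software as follows. This is a minimal illustration rather than the patented hardware: the names `Mode`, `tree_reduce` and `engine` are hypothetical, the window and kernel are flattened to lists, and the choice of the last product as the one that bypasses the tree is an assumption (the claim only says "one of" the multipliers).

```python
from enum import Enum
from typing import Callable, Sequence


class Mode(Enum):
    CONV = "conv"
    AVG_POOL = "avg_pool"
    MAX_POOL = "max_pool"


def tree_reduce(values: Sequence[float], node: Callable[[float, float], float]) -> float:
    """Combine values pairwise, level by level, like a tree of two-input nodes.

    Expects at least one value.
    """
    level = list(values)
    while len(level) > 1:
        nxt = [node(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an unpaired value passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


def engine(window: Sequence[float], kernel: Sequence[float], mode: Mode,
           activation: Callable[[float], float] = lambda s: s) -> float:
    """Hypothetical model of the claimed engine for a flattened M*N window (M*N >= 2)."""
    # One multiplier per window position: input map value times kernel value.
    products = [x * k for x, k in zip(window, kernel)]
    # Assumption: the last product bypasses the tree and feeds the output logic.
    *tree_inputs, bypass = products
    # Every node either sums its inputs or outputs the larger one.
    node = max if mode is Mode.MAX_POOL else (lambda a, b: a + b)
    tree_out = tree_reduce(tree_inputs, node)
    # The output logic combines the tree output with the bypassed product.
    combined = node(tree_out, bypass)
    # Only convolution mode applies the activation function.
    return activation(combined) if mode is Mode.CONV else combined
```

For a 3×3 window holding the values 1.0 to 9.0, `engine(window, [1.0] * 9, Mode.MAX_POOL)` returns 9.0, and a kernel of `[1.0 / 9] * 9` in `Mode.AVG_POOL` gives the window mean (5.0, up to floating point rounding). The same multiply-then-reduce path serves all three layer types; only the kernel values and the node operation change.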

Claim 2

Original Legal Text

2. A neural network engine according to claim 1 wherein said activation function is one of: an identity function, a ReLU function or a PReLU function.

Plain English Translation

Claim 2 narrows claim 1 by restricting the activation function that the output logic applies in convolution mode to one of three options: an identity function, which passes its input through unchanged; a ReLU (Rectified Linear Unit) function, which outputs the input when it is positive and zero otherwise; or a PReLU (Parametric ReLU) function, which passes positive inputs unchanged and scales negative inputs by a slope. All other features of the engine are as recited in claim 1.
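The three activation functions named in claim 2 have standard textbook definitions, sketched below. The function names and the `slope` parameter are illustrative; the claim does not say how the engine stores or applies the PReLU slope.

```python
def identity(x: float) -> float:
    """Identity: the input passes through unchanged."""
    return x


def relu(x: float) -> float:
    """ReLU: positive inputs pass through, everything else becomes zero."""
    return x if x > 0.0 else 0.0


def prelu(x: float, slope: float) -> float:
    """PReLU: positive inputs pass through, other inputs are scaled by `slope`."""
    return x if x > 0.0 else slope * x
```

For example, `prelu(-2.0, 0.25)` returns -0.5, whereas `relu(-2.0)` returns 0.0.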

Claim 3

Original Legal Text

3. A neural network engine according to claim 1 wherein said activation function is defined by a binary slope parameter and a slope coefficient.

Plain English Translation

Claim 3 narrows claim 1 by specifying that the activation function applied in convolution mode is defined by two parameters: a binary slope parameter and a slope coefficient. The claim text does not state how the two parameters combine or what each value of the binary parameter selects; those details are left to the patent's description. All other features of the engine are as recited in claim 1.

Claim 5

Original Legal Text

5. A neural network engine according to claim 4, wherein said engine is configured to implement said activation function by modifying an exponent of input x.

Plain English Translation

Claim 5 depends on claim 4, which is not reproduced on this page. It adds that the engine implements the activation function by modifying the exponent of the input x. Because the engine operates on floating point values, changing a value's exponent scales it by a power of two, so this wording suggests that the activation function's scaling is achieved by adjusting the exponent rather than by a full multiplication. The claim text shown here does not give the exact form of the function.
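As background for the phrase "modifying an exponent": a binary floating point number can be scaled by a power of two by changing only its exponent. The sketch below illustrates that idea with Python's `math.ldexp`. It is an assumption that the slope takes the form 1/2**coef, since claim 4 is not shown here, and the names `scale_by_power_of_two`, `prelu_pow2` and `coef` are hypothetical.

```python
import math


def scale_by_power_of_two(x: float, coef: int) -> float:
    """Return x / 2**coef by adjusting only the binary exponent of x."""
    return math.ldexp(x, -coef)


def prelu_pow2(x: float, coef: int) -> float:
    """Assumed PReLU variant whose negative-side slope is 1 / 2**coef."""
    return x if x > 0.0 else scale_by_power_of_two(x, coef)


# math.frexp splits a float into (mantissa, exponent): the mantissa is
# unchanged by the scaling and the exponent drops by `coef`.
mantissa_before, exponent_before = math.frexp(-6.0)
mantissa_after, exponent_after = math.frexp(scale_by_power_of_two(-6.0, 2))
```

Here `scale_by_power_of_two(-6.0, 2)` returns -1.5, and the two `frexp` results share the mantissa -0.75 while the exponent falls from 3 to 1. In hardware, this kind of scaling needs only a small integer subtraction on the exponent field instead of a floating point multiplier.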

Claim 6

Original Legal Text

6. A neural network engine according to claim 1, wherein when said engine is configured to process a peak finding layer of the neural network, each of said kernel values comprises a value equal to 1, said nodes of the tree are configured to output a larger of their inputs and said output logic is configured to output a larger of its first and second inputs as said output of said output logic, and wherein said engine is configured to provide an image value from within said M×N window corresponding to said pixel as an input to said one of said M×N floating point multipliers, and to provide the output of said one of said M×N floating point multipliers to said second input of said output logic.

Plain English Translation

Claim 6 adds a further mode, peak finding, to the engine of claim 1. As in max pooling, every kernel value is set to 1, each tree node outputs the larger of its inputs, and the output logic outputs the larger of its two inputs. In addition, the image value from within the M×N window that corresponds to the pixel being processed is routed to the multiplier whose output feeds the second input of the output logic. The output logic therefore compares the pixel's own value with the maximum produced by the tree.

Claim 7

Original Legal Text

7. A neural network engine according to claim 6 wherein said output module is configured to provide a second output indicating if said output module second input is greater than said output module first input.

Plain English Translation

Claim 7 builds on the peak finding mode of claim 6. The output module provides a second output that indicates whether its second input is greater than its first input. Since claim 6 routes the pixel's own value to the second input and the tree output arrives at the first input, this second output indicates whether the pixel's value exceeds the maximum produced by the tree, that is, whether the pixel is a peak within the window. The claim's "output module" appears to refer to the output logic of claims 1 and 6.
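The peak finding mode of claims 6 and 7 can be sketched as below. This is an illustrative model only: `peak_find` and `centre_index` are hypothetical names, the window is flattened to a list of at least two values, and it is an assumption that the tree receives every window value except the one routed to the output logic.

```python
from typing import Sequence, Tuple


def peak_find(window: Sequence[float], centre_index: int) -> Tuple[float, bool]:
    """Return (larger value, peak flag) for one flattened M*N window."""
    # Every kernel value is 1, so each multiplier passes its input through.
    centre = window[centre_index] * 1.0
    others = [v * 1.0 for i, v in enumerate(window) if i != centre_index]
    # First input of the output logic: the tree's maximum over the other values.
    tree_out = max(others)
    # Second input of the output logic: the value for the pixel itself.
    larger = max(tree_out, centre)
    # Second output (claim 7): is the second input greater than the first?
    is_peak = centre > tree_out
    return larger, is_peak
```

For the window `[1, 2, 3, 4, 9, 5, 6, 7, 8]` with `centre_index=4`, the function returns `(9.0, True)`. If the 9 sits at a different position, the larger value is still 9.0 but the flag is `False`. The comparison is strict, following the claim's "greater than", so a pixel that merely ties with the tree maximum is not flagged.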

Claim 8

Original Legal Text

8. A neural network engine according to claim 1 wherein said engine is further configured to provide a plurality of outputs from penultimate nodes of said tree so that when said engine is configured to process a pooling layer, said engine simultaneously provides a plurality of pooling results for respective sub-windows of said M×N window.

Plain English Translation

Claim 8 adds that the engine can expose outputs from the penultimate nodes of the tree, that is, the nodes one level before the output node. When the engine processes a pooling layer, each of these nodes holds the pooling result for part of the M×N window, so the engine provides pooling results for several sub-windows simultaneously rather than a single result for the whole window.
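One way to picture claim 8 is a pairwise reduction that also returns the level just before the root. The sketch below assumes a power-of-two number of tree inputs and assumes the inputs are ordered so that each half of the list belongs to one sub-window; neither detail comes from the claim, and `pool_with_sub_windows` is a hypothetical name.

```python
from typing import Callable, List, Sequence, Tuple


def pool_with_sub_windows(
    values: Sequence[float], node: Callable[[float, float], float]
) -> Tuple[float, List[float]]:
    """Reduce `values` with a binary tree; return (root, penultimate level)."""
    n = len(values)
    # Assumption for this sketch: a power-of-two number of inputs (at least 2).
    assert n >= 2 and (n & (n - 1)) == 0
    level = list(values)
    penultimate = level
    while len(level) > 1:
        penultimate = level  # remember the level that feeds the next one
        level = [node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0], penultimate
```

With `values = [1, 5, 2, 3, 7, 0, 4, 6]` and `node=max`, the result is `(7, [5, 7])`: the root holds the maximum of all eight inputs, while the two penultimate nodes hold the maxima of the first and second groups of four.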

Claim 9

Original Legal Text

9. A neural network engine according to claim 1, wherein the neural network engine is further configured to implement logic to perform at least one of: process the convolution layer, process the average pooling layer, or process the max pooling layer.

Plain English Translation

Claim 9 adds that the engine of claim 1 is further configured to implement logic that performs at least one of the three operations recited in claim 1: processing the convolution layer, processing the average pooling layer, or processing the max pooling layer. The claim does not add any new hardware element beyond this logic.

Claim 10

Original Legal Text

10. A neural network engine according to claim 1, wherein said output logic is further configured to receive a control input value and process, based at least in part on the received control input value, at least one of: the convolution layer, the average pooling layer, or the max pooling layer.

Plain English Translation

Claim 10 adds that the output logic receives a control input value and processes the convolution layer, the average pooling layer, or the max pooling layer based at least in part on that value. In other words, the control input helps determine how the output logic handles the layer being processed. The claim does not specify how the control input is encoded or where it comes from.

Claim 11

Original Legal Text

11. A device comprising: hardware programmed to: configure M×N floating point multipliers, each floating point multiplier having a first operand input configured to be connected to an input map value and a second operand input configured to be connected to a corresponding kernel value; provide the first operand input and the second operand input to an input node of a tree of nodes, the input node of the tree being configured to provide a floating point number output corresponding to either: a larger of the first operand input and the second operand input; or a sum of the first operand input and the second operand input; provide a first output node of the tree to a first input of an output logic and provide an output of one of the M×N floating point multipliers to a second input of the output logic; and configure the output logic to: process a convolution layer of a neural network, wherein the kernel value comprises a trained value for the convolution layer, the nodes of the tree are configured to sum the first operand input and the second operand input and the output logic is further configured to sum its first and second inputs, to apply an activation function to the sum and to provide an output of the activation function as the output of the output logic; process an average pooling layer of the neural network, wherein the kernel value comprises a value corresponding to 1/(M×N), the nodes of the tree are configured to sum the first operand input and the second operand input and the output logic is further configured to sum its first and second inputs and to provide the sum as the output of the output logic; and process a max pooling layer of the neural network, wherein the kernel value comprises a value equal to 1, the nodes of the tree are configured to output a larger of the first operand input and the second operand input and the output logic is further configured to output a larger of its first and second inputs as the output of the output logic.

Plain English Translation

The device is designed for efficient neural network processing, specifically for convolution, average pooling, and max pooling operations. It addresses the challenge of implementing these operations in hardware with minimal computational overhead. The device includes a grid of M×N floating-point multipliers, each receiving an input map value and a corresponding kernel value. These multipliers feed into a tree of nodes that can either sum the inputs or select the larger value, depending on the operation being performed. The tree's output is combined with a direct multiplier output in an output logic unit. For convolution, the kernel values are trained weights, and the tree sums inputs, with the output logic summing results and applying an activation function. For average pooling, the kernel values are set to 1/(M×N), and the tree sums inputs, with the output logic providing the average. For max pooling, the kernel values are 1, and the tree selects the larger input, with the output logic propagating the maximum value. This architecture enables flexible, hardware-accelerated neural network processing with shared components for different operations.

Claim 12

Original Legal Text

12. A device according to claim 11, wherein said output logic is further configured to receive a control input value and process, based at least in part on the received control input value, at least one of: the convolution layer, the average pooling layer, or the max pooling layer.

Plain English Translation

Claim 12 applies the control input of claim 10 to the device of claim 11. The output logic of the device receives a control input value and processes the convolution layer, the average pooling layer, or the max pooling layer based at least in part on that value. The claim does not specify how the control input is encoded or where it comes from.

Claim 13

Original Legal Text

13. A device according to claim 11, wherein said activation function is one of: an identity function, a ReLU function or a PReLU function.

Plain English Translation

Claim 13 narrows the device of claim 11 by restricting the activation function applied in convolution mode to one of three options: an identity function, which outputs its input unchanged; a Rectified Linear Unit (ReLU) function, which outputs the input when it is positive and zero otherwise; or a Parametric ReLU (PReLU) function, which passes positive inputs unchanged and scales negative inputs by a slope. This mirrors claim 2, which applies the same restriction to the engine of claim 1.

Claim 14

Original Legal Text

14. A device according to claim 11, wherein said activation function is defined by a binary slope parameter and a slope coefficient.

Plain English Translation

Claim 14 narrows the device of claim 11 by specifying that the activation function applied in convolution mode is defined by two parameters: a binary slope parameter and a slope coefficient. As with claim 3, the claim text does not state how the two parameters combine or what each value of the binary parameter selects, and it does not recite any memory or other component for storing them.

Claim 16

Original Legal Text

16. A method comprising: configuring floating point multipliers, each floating point multiplier having a first operand input configured to be connected to an input map value and a second operand input configured to be connected to a corresponding kernel value; providing the first operand input and the second operand input to an input node of a tree of nodes, the input node of the tree being configured to provide a floating point number output corresponding to either: a larger of the first operand input and the second operand input; or a sum of the first operand input and the second operand input; providing a first output node of the tree to a first input of an output logic; providing one of the floating point multipliers to a second input of the output logic; and processing, by the output logic, at least one of: a convolution layer of a neural network, wherein the kernel value comprises a trained value for the convolution layer, the nodes of the tree are configured to sum the first operand input and the second operand input and wherein the output logic is configured to sum its first and second inputs, to apply an activation function to the sum and to provide an output of the activation function as the output of the output logic; an average pooling layer of the neural network, wherein the kernel value comprises a value corresponding to 1/(M×N), the nodes of the tree are configured to sum the first operand input and the second operand input and wherein the output logic is configured to sum its first and second inputs and to provide the sum as the output of the output logic; or a max pooling layer of the neural network, wherein the kernel value comprises a value equal to 1, the nodes of the tree are configured to output a larger of the first operand input and the second operand input and wherein the output logic is configured to output a larger of its first and second inputs as the output of the output logic.

Plain English Translation

This invention relates to a method for processing neural network layers, specifically convolution, average pooling, and max pooling layers, using a configurable hardware architecture. The method addresses the need for efficient and flexible computation in neural networks by leveraging floating-point multipliers and a tree of nodes to perform operations without redundant hardware. The method involves configuring floating-point multipliers, each with two operand inputs: one connected to an input map value and the other to a corresponding kernel value. These inputs are fed into a tree of nodes, where each node can either output the larger of the two inputs or their sum, depending on the layer type being processed. The tree's output is provided to a first input of an output logic, while one of the floating-point multipliers is connected to a second input of the output logic. For convolution layers, the kernel values are trained weights, the tree nodes sum the inputs, and the output logic sums its inputs, applies an activation function, and outputs the result. For average pooling, the kernel values are set to 1/(M×N), where M and N are pooling dimensions, and the output logic sums its inputs without further processing. For max pooling, the kernel values are set to 1, the tree nodes output the larger input, and the output logic selects the larger of its two inputs. This approach allows a single hardware structure to efficiently handle multiple neural network operations.
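The three modes can be summarized as formulas. In the sketch below, x_i denotes the i-th input map value in the window, w_i the trained kernel value, f the activation function, and y the output of the output logic; these symbols are notation introduced here for illustration, not terms from the patent.

```latex
\begin{aligned}
\text{convolution:} \quad & y = f\Big(\sum_{i=1}^{MN} w_i\, x_i\Big) \\
\text{average pooling:} \quad & y = \sum_{i=1}^{MN} \frac{1}{MN}\, x_i = \frac{1}{MN} \sum_{i=1}^{MN} x_i \\
\text{max pooling:} \quad & y = \max_{1 \le i \le MN} \left(1 \cdot x_i\right) = \max_{1 \le i \le MN} x_i
\end{aligned}
```

Setting every kernel value to 1/(M×N) turns the multiply-and-sum path into an average, and setting every kernel value to 1 leaves the inputs unchanged so that the tree can compare them directly.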

Claim 17

Original Legal Text

17. A method according to claim 16, wherein the processing by the output logic is based at least in part on a value associated with a control input.

Plain English Translation

Claim 17 narrows the method of claim 16 by specifying that the processing performed by the output logic is based at least in part on a value associated with a control input. This is the method counterpart of claims 10 and 12: the control input helps determine how the output logic handles the convolution, average pooling, or max pooling layer being processed. The claim does not specify how the control input is encoded or where it comes from.

Claim 18

Original Legal Text

18. A method according to claim 16, wherein said activation function is one of: an identity function, a ReLU function or a PReLU function.

Plain English Translation

Claim 18 narrows the method of claim 16 by restricting the activation function applied when processing a convolution layer to one of three options: an identity function, which outputs its input unchanged; a Rectified Linear Unit (ReLU) function, which outputs the input when it is positive and zero otherwise; or a Parametric ReLU (PReLU) function, which passes positive inputs unchanged and scales negative inputs by a slope. The claim does not recite any step of evaluating or selecting among these functions during training; it only limits which function the output logic applies.

Claim 19

Original Legal Text

19. A method according to claim 16, wherein said activation function is defined by a binary slope parameter and a slope coefficient.

Plain English Translation

Claim 19 narrows the method of claim 16 by specifying that the activation function applied when processing a convolution layer is defined by two parameters: a binary slope parameter and a slope coefficient. This is the method counterpart of claims 3 and 14. The claim text does not state how the two parameters combine or what each value of the binary parameter selects.

Patent Metadata

Filing Date

Unknown

Publication Date

February 11, 2020

Inventors

Mihai Constantin MUNTEANU

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, FAQs, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Neural network engine” (10558430). https://patentable.app/patents/10558430

© 2026 Nomic Interactive Technology LLC. Machine-readable context available at /api/llm-context/10558430. See llms.txt for full attribution policy.