US-10592801

Apparatus and methods for forward propagation in convolutional neural networks

Published: March 17, 2020

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

Aspects for forward propagation of a convolutional artificial neural network are described herein. The aspects may include a direct memory access unit configured to receive input data from a storage device and a master computation module configured to select one or more portions of the input data based on a predetermined convolution window. Further, the aspects may include one or more slave computation modules respectively configured to convolute a convolution kernel with one of the one or more portions of the input data to generate a slave output value. Further still, the aspects may include an interconnection unit configured to combine the one or more slave output values into one or more intermediate result vectors, wherein the master computation module is further configured to merge the one or more intermediate result vectors into a merged intermediate vector.
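The data flow in the abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented hardware: the function names (`select_windows`, `slave_convolve`, `forward`), the stride parameter, the choice of one whole kernel per slave, and the reading of "merge" as concatenation are all assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def select_windows(data: Matrix, kh: int, kw: int, stride: int = 1) -> List[List[float]]:
    """Master step (assumed): slide a kh x kw convolution window over the
    input and return each selected portion flattened into a vector."""
    rows, cols = len(data), len(data[0])
    windows = []
    for r in range(0, rows - kh + 1, stride):
        for c in range(0, cols - kw + 1, stride):
            windows.append([data[r + i][c + j] for i in range(kh) for j in range(kw)])
    return windows


def slave_convolve(kernel: Matrix, window: List[float]) -> float:
    """Slave step: multiply one kernel with one input portion element-wise
    and sum the products to get a single slave output value."""
    flat_kernel = [w for row in kernel for w in row]
    return sum(w * x for w, x in zip(flat_kernel, window))


def forward(data: Matrix, kernels: List[Matrix], stride: int = 1) -> List[float]:
    """Run every slave on every window, then combine and merge the results."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    merged: List[float] = []
    for window in select_windows(data, kh, kw, stride):
        # Interconnection step: gather one slave output per kernel into an
        # intermediate result vector for this window position.
        intermediate = [slave_convolve(k, window) for k in kernels]
        # Master step: merging is modeled here as concatenation (assumption).
        merged.extend(intermediate)
    return merged
```

With a 3 x 3 input and two 2 x 2 kernels, `forward` visits four window positions and returns eight values, two per position.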

Patent Claims

17 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An apparatus for forward propagation of a convolutional neural network, comprising: a master computation circuit configured to receive input data, and select, in response to an instruction, one or more portions of the input data based on a predetermined convolution window, wherein the instruction is selected from the group consisting of a convolution network sigmoid instruction, a convolution network tanh instruction, a convolution network relu instruction, and a convolution network group instruction, wherein the instruction includes a first address of the one or more portions of the input data, a first size of the one or more portions of the input data, a second address of the portion of the convolution kernel, a second size of the portion of the convolution kernel, wherein the convolution network sigmoid instruction further includes an indication of a sigmoid function as an activation function, wherein the convolution network tanh instruction further includes an indication of a tanh function as the activation function, wherein the convolution network relu instruction further includes an indication of a relu function as the activation function, and wherein the convolution network group instruction further includes an output address; one or more slave computation circuits respectively configured to convolute a portion of a convolution kernel with one of the one or more portions of the input data to generate a slave output value; and an interconnection circuit configured to combine the one or more slave output values into one or more intermediate result vectors, wherein the master computation circuit is further configured to merge the one or more intermediate result vectors into a merged intermediate vector.

2. The apparatus of claim 1, wherein each of the one or more slave computation circuits includes a slave neuron caching circuit configured to store one of the one or more portions of the input data.

3. The apparatus of claim 1, wherein each of the one or more slave computation circuits includes a weight value caching circuit configured to store the portion of the convolution kernel that corresponds to the slave computation circuit.

4. The apparatus of claim 1, wherein each of the one or more slave computation circuits includes a vector multiplier configured to multiply the portion of the convolution kernel with each of the one or more portions of the input data.

5. The apparatus of claim 4, wherein each of the one or more slave computation circuits includes an adder configured to sum results of a multiplication of the portion of the convolution kernel with each of the one or more portions of the input data to generate the slave output value.

6. The apparatus of claim 1, wherein the master computation circuit includes a merging circuit configured to merge the one or more intermediate result vectors into the merged intermediate vector.

7. The apparatus of claim 6, wherein the master computation circuit includes a master neuron caching circuit configured to store a bias value; and an adder configured to add the bias value to the merged intermediate vector to generate a biased intermediate vector.

8. The apparatus of claim 7, wherein the master computation circuit includes an activator configured to activate the biased intermediate vector by applying an activation function to the biased intermediate vector.

9. The apparatus of claim 8, wherein the activation function is a function indicated by the instruction and selected from the group consisting of a sigmoid function, a tanh function, a relu function, and a softmax function.

10. A method for forward propagation of a convolutional neural network, comprising: receiving, by a master computation circuit, input data; selecting, by the master computation circuit, one or more portions of the input data based on a predetermined convolution window in response to an instruction, wherein the instruction is selected from the group consisting of a convolution network sigmoid instruction, a convolution network tanh instruction, a convolution network relu instruction, and a convolution network group instruction, wherein the instruction includes a first address of the one or more portions of the input data, a first size of the one or more portions of the input data, a second address of the portion of the convolution kernel, a second size of the portion of the convolution kernel, wherein the convolution network sigmoid instruction further includes an indication of a sigmoid function as an activation function, wherein the convolution network tanh instruction further includes an indication of a tanh function as the activation function, wherein the convolution network relu instruction further includes an indication of a relu function as the activation function, and wherein the convolution network group instruction further includes an output address; convoluting, by one of one or more slave computation circuits, a portion of a convolution kernel with one of the one or more portions of the input data to generate a slave output value; and combining, by an interconnection circuit, the one or more slave output values into one or more intermediate result vectors.

11. The method of claim 10, further comprising merging, by the master computation circuit, the one or more intermediate result vectors into a merged intermediate vector.

12. The method of claim 10, further comprising storing, by a slave neuron caching circuit of each of the one or more slave computation circuits, one of the one or more portions of the input data.

13. The method of claim 10, further comprising storing, by a weight value caching circuit of each of the one or more slave computation circuits, the portion of the convolution kernel that corresponds to the slave computation module circuit.

14. The method of claim 13, wherein the convoluting further comprises adding, by an adder of each of the one or more slave computation circuits, results of a multiplication of the portion of the convolution kernel with each of the one or more portions of the input data to generate the slave output value.

15. The method of claim 10 , further comprising: storing, by a master neuron caching circuit of the master computation circuit, a bias value; and adding, by an adder of the master computation circuit, the bias value to the merged intermediate vector to generate a biased intermediate vector.

16. The method of claim 15, further comprising activating, by an activator, the biased intermediate vector by applying an activation function to the biased intermediate vector.

17. The method of claim 16, wherein the activation function is a function selected from the group consisting of a sigmoid function, a tanh function, a relu function, and a softmax function.
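As a rough illustration of the instruction fields recited in claims 1 and 10 and the bias and activation steps of claims 7 to 9 and 15 to 17, the following Python sketch models an instruction as a record and applies the indicated activation. The class name `ConvInstruction`, its field names, the opcode strings, and the function `finish` are hypothetical and do not come from the patent. The sketch also assumes a scalar bias, assumes the group instruction applies no activation (the claims recite only an output address for it), and omits softmax.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class ConvInstruction:
    """Hypothetical record of the fields the claims attribute to an instruction."""
    opcode: str                        # "sigmoid", "tanh", "relu", or "group"
    input_addr: int                    # first address (input data portions)
    input_size: int                    # first size
    kernel_addr: int                   # second address (kernel portion)
    kernel_size: int                   # second size
    output_addr: Optional[int] = None  # recited only for the group instruction


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Activation indicated by each opcode; "group" is deliberately absent.
ACTIVATIONS: Dict[str, Callable[[float], float]] = {
    "sigmoid": sigmoid,
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
}


def finish(instr: ConvInstruction, merged: List[float], bias: float) -> List[float]:
    """Add the bias to the merged intermediate vector, then activate it."""
    biased = [m + bias for m in merged]
    activation = ACTIVATIONS.get(instr.opcode)
    if activation is None:
        # Assumption: a group instruction returns the biased vector unchanged.
        return biased
    return [activation(v) for v in biased]
```

For example, `finish(ConvInstruction("relu", 0, 4, 16, 4), [-1.0, 2.0], 0.5)` clamps the first biased element to zero and leaves the second at 2.5.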

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06N G06F

Patent Metadata

Filing Date

October 29, 2018

Publication Date

March 17, 2020
