A computer-implemented method for improving the efficiency of computing an activation function in a neural network system includes initializing, by a controller, weights in a weight vector associated with the neural network system. Further, the method includes receiving, by the controller, an input vector of input values for computing a dot product with the weight vector for the activation function, which determines an output value of a node in the neural network system. The method further includes predicting, by a rectifier linear unit (ReLU), which computes the activation function, that the output value of the node will be negative based on computing an intermediate value for computing the dot product, and based on a magnitude of the intermediate value exceeding a precomputed threshold value. Further, the method includes, in response to the prediction, terminating, by the ReLU, the computation of the dot product, and outputting a 0 as the output value.
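The early-termination step described above can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the input values are unsigned integers processed one bit per cycle, most significant bit first, and that the weights are signed integers. The function name `relu_dot_bit_serial` and the way thresholds are passed in as a list are hypothetical choices made for illustration.

```python
def relu_dot_bit_serial(weights, inputs, thresholds, bits):
    """Bit-serial ReLU(w . x) with early termination (illustrative sketch).

    Assumptions (not taken from the source): inputs are unsigned integers
    below 2**bits, bits are consumed MSB first, and thresholds[b] is the
    cutoff applied after cycle b.
    Returns (output_value, cycles_used).
    """
    partial = 0  # intermediate value of the dot product
    for b in range(bits):
        shift = bits - 1 - b
        # Sum the weights whose input has the current bit set.
        bit_sum = sum(w for w, x in zip(weights, inputs) if (x >> shift) & 1)
        partial += bit_sum * (1 << shift)
        # Predict a negative output: the intermediate value is negative
        # and its magnitude exceeds this cycle's threshold.
        if partial < 0 and -partial > thresholds[b]:
            return 0, b + 1  # terminate early and output 0
    # No early exit: apply the usual ReLU to the full dot product.
    return max(partial, 0), bits


# Negative case: dot product is -53, detected after the first cycle.
print(relu_dot_bit_serial([-5, 2, 1], [12, 3, 1], [21, 9, 3, 0], 4))  # -> (0, 1)
# Non-negative case: all three cycles run and the full value is returned.
print(relu_dot_bit_serial([3, -1], [5, 2], [9, 3, 0], 3))  # -> (13, 3)
```

In the first call the remaining three bit cycles are skipped, which is the source of the efficiency gain the abstract describes.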
Legal claims defining the scope of protection, as filed with the USPTO.
2. The computer-implemented method of claim 1, wherein the intermediate value is computed at each b-th computation cycle using the b-th bit from each input value, and a sign of the intermediate value is negative.
3. The computer-implemented method of claim 2, wherein the precomputed threshold value is unique to the b-th computation cycle.
4. The computer-implemented method of claim 2 further comprising, determining a respective precomputed threshold for each computation cycle corresponding to a number of bits used to represent each of the input values.
5. The computer-implemented method of claim 4, wherein the plurality of precomputed thresholds is stored in a threshold table in the ReLU.
6. The computer-implemented method of claim 1 further comprising, in response to predicting that the output value will be non-negative, continuing the computation of the dot product.
8. The system of claim 7, wherein the intermediate value is computed at each b-th computation cycle using the b-th bit from each input value, and a sign of the intermediate value is negative.
9. The system of claim 8, wherein the precomputed threshold value is particular to the b-th computation cycle.
10. The system of claim 8, wherein the method further comprises, determining a respective precomputed threshold for each computation cycle corresponding to a number of bits used to represent each of the input values.
11. The system of claim 10, wherein the plurality of precomputed thresholds is stored in a threshold table in the ReLU.
12. The system of claim 7, wherein the method further comprises, in response to predicting that the output value will be non-negative, continuing the computation of the dot product.
14. The ReLU of claim 13, wherein the intermediate value is computed at each b-th computation cycle using the b-th bit from each input value, and a sign of the intermediate value is negative.
15. The ReLU of claim 14, wherein the precomputed threshold value is particular to the b-th computation cycle.
16. The ReLU of claim 14, wherein the method further comprises, determining a respective precomputed threshold for each computation cycle corresponding to a number of bits used to represent each of the input values.
17. The ReLU of claim 16, wherein the plurality of precomputed thresholds is stored in a threshold table in the storage medium.
18. The ReLU of claim 13, wherein the method further comprises, in response to predicting that the output value will be non-negative, continuing the computation of the dot product.
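Claims 4, 5, 10, 11, 16, and 17 recite a respective precomputed threshold for each computation cycle, stored in a threshold table, but the claims listed here do not state how the values are derived. One conservative possibility is sketched below as an assumption rather than as the patented method. It assumes unsigned input values processed most significant bit first, and sets each threshold to the largest positive contribution the unprocessed bits could still add. The function name `build_threshold_table` is hypothetical.

```python
def build_threshold_table(weights, bits):
    """Return one threshold per computation cycle (illustrative sketch).

    Assumption (not from the source): inputs are unsigned and processed
    MSB first, so after cycle b the unprocessed low-order bits of each
    input are worth at most 2**(bits - 1 - b) - 1. Only positive weights
    can raise the dot product, so the bound uses their sum.
    """
    positive_weight_sum = sum(w for w in weights if w > 0)
    return [
        ((1 << (bits - 1 - b)) - 1) * positive_weight_sum
        for b in range(bits)
    ]


# Positive weights sum to 3; remaining bits are worth at most 7, 3, 1, 0.
print(build_threshold_table([-5, 2, 1], 4))  # -> [21, 9, 3, 0]
```

With this choice, a negative intermediate value whose magnitude exceeds the cycle's threshold can no longer be brought back to zero or above, so terminating never changes the output. Smaller thresholds would terminate sooner at the cost of occasional mispredictions.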
February 21, 2020
January 24, 2023