Compression of deep neural networks

Published: April 23, 2024

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

In an approach for compressing a neural network, a processor receives a neural network, wherein the neural network has been trained on a set of training data. A processor receives a compression ratio. A processor compresses the neural network based on the compression ratio using an optimization model to solve for sparse weights. A processor re-trains the compressed neural network with the sparse weights. A processor outputs the re-trained neural network.
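The compression step in the abstract can be illustrated with a minimal sketch. The excerpt does not specify the optimization model, so the code below swaps in plain magnitude thresholding, which is the Euclidean projection of a weight vector onto vectors with at most k nonzero entries. The function names `zero_fraction` and `sparsify`, the flat list of weights, and the reading of the compression ratio as zeros per nonzero weight (taken from the dependent claims) are assumptions for illustration, not the patented method.

```python
def zero_fraction(ratio):
    """Convert a zeros-to-nonzeros ratio into the fraction of weights set to zero."""
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    return ratio / (1.0 + ratio)


def sparsify(weights, ratio):
    """Keep the largest-magnitude weights and zero out the rest.

    Stand-in for the optimization model (an assumption): this is the
    Euclidean projection of `weights` onto vectors with a fixed number
    of nonzero entries.
    """
    n = len(weights)
    n_zero = int(n * zero_fraction(ratio))  # number of weights to prune (floor)
    # Indices sorted by ascending magnitude; the first n_zero are pruned.
    order = sorted(range(n), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_zero])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.03, 0.6]
sparse = sparsify(weights, ratio=3.0)  # 3 zeros per nonzero, i.e. 75% zeros
print(sparse)  # [0.9, 0.0, 0.0, 0.0, -0.7, 0.0, 0.0, 0.0]
```

With a ratio of 3.0, six of the eight weights are zeroed and only the two largest in magnitude survive. A real optimization model would presumably also account for the network's loss, which this projection ignores.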

Patent Claims

8 claims

Legal claims defining the scope of protection, as filed with the USPTO.

2. The method of claim 1, wherein the compression ratio is a ratio of zero weights to nonzero weights that is based on an amount of memory storage available for running the re-trained neural network on a computing device.

3. The method of claim 2, wherein the computing device is a mobile device.

5. The method of claim 1, wherein re-training the compressed neural network with the sparse weights comprises minimizing to a pre-defined value an average squared difference between a label for an input of the neural network and final output of the neural network based on the set of sparse weights.

7. The computer program product of claim 6, wherein the compression ratio is a ratio of zero weights to nonzero weights that is based on an amount of memory storage available for running the re-trained neural network on a computing device.

8. The computer program product of claim 6, wherein the computing device is a mobile device.

10. The computer program product of claim 6, wherein the program instructions to re-train the compressed neural network with the sparse weights comprises program instructions to minimize to a pre-defined value an average squared difference between a label for an input of the neural network and final output of the neural network based on the set of sparse weights.

12. The computer system of claim 11, wherein the compression ratio is a ratio of zero weights to nonzero weights that is based on an amount of memory storage available for running the re-trained neural network on a computing device.

13. The computer system of claim 12, wherein the computing device is a mobile device.
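Two ideas in the claims above lend themselves to a short sketch: deriving the zeros-to-nonzeros ratio from a memory budget, and re-training the sparse weights until the average squared difference between labels and outputs reaches a pre-defined value. The claims do not say how the memory amount maps to a ratio, so `ratio_from_memory` is a hypothetical mapping that assumes only nonzero weights are stored at a fixed byte cost. The single linear layer, the learning rate, and the toy data are likewise assumptions chosen to keep the example small.

```python
def ratio_from_memory(n_weights, budget_bytes, bytes_per_weight=4):
    """Hypothetical mapping from a memory budget to a zeros-to-nonzeros ratio.

    Assumes only nonzero weights are stored, each costing `bytes_per_weight`
    bytes, and ignores the index overhead of a real sparse format.
    """
    k = min(n_weights, budget_bytes // bytes_per_weight)
    if k <= 0:
        raise ValueError("budget too small to store any weight")
    return (n_weights - k) / k


def mse(weights, inputs, labels):
    """Average squared difference between labels and linear-model outputs."""
    total = 0.0
    for x, y in zip(inputs, labels):
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (y - pred) ** 2
    return total / len(inputs)


def retrain(weights, inputs, labels, target_mse, lr=0.1, max_steps=10_000):
    """Gradient descent on the nonzero weights only, until MSE <= target_mse."""
    mask = [w != 0.0 for w in weights]  # pruned positions stay at zero
    w = list(weights)
    for _ in range(max_steps):
        if mse(w, inputs, labels) <= target_mse:
            break
        grad = [0.0] * len(w)
        for x, y in zip(inputs, labels):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / len(inputs)
        w = [wi - lr * g if keep else 0.0 for wi, g, keep in zip(w, grad, mask)]
    return w


# Toy data generated by the sparse weights [2.0, 0.0, -1.0].
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
labels = [2.0, 0.0, -1.0, 2.0]
start = [1.5, 0.0, -0.5]  # second weight was pruned during compression
final = retrain(start, inputs, labels, target_mse=1e-6)
print(final, mse(final, inputs, labels))
```

Masking the update keeps the sparsity pattern fixed during re-training, so the memory footprint implied by the ratio is preserved. For example, `ratio_from_memory(1000, 1000)` allows 250 stored weights and gives a ratio of 3.0 under the stated assumptions.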

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06N

Patent Metadata

Filing Date

March 13, 2019

Publication Date

April 23, 2024
