US-11886359

AI accelerator apparatus using in-memory compute chiplet devices for transformer workloads

Published: January 30, 2024

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

An AI accelerator apparatus using in-memory compute chiplet devices. The apparatus includes one or more chiplets, each of which includes a plurality of tiles. Each tile includes a plurality of slices, a central processing unit (CPU), and a hardware dispatch device. Each slice can include a digital in-memory compute (DIMC) device configured to perform high throughput computations. In particular, the DIMC device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications. The apparatus can also include a single input multiple data (SIMD) device configured to further process the DIMC output and compute softmax functions for the attention functions. The chiplet can also include die-to-die (D2D) interconnects, a peripheral component interconnect express (PCIe) bus, a dynamic random access memory (DRAM) interface, and a global CPU interface to facilitate communication between the chiplets, memory, and a server or host system.
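The division of work described in the abstract can be illustrated with a small software sketch of scaled dot-product attention. This is not the patented hardware and not code from the patent: the function names (`matmul`, `softmax`, `attention`) are hypothetical, and the comments that map each step to the DIMC or SIMD device are an assumption based only on the abstract's wording.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows.

    Assumption: this bulk multiply-accumulate work is the kind of
    computation the abstract assigns to the DIMC device.
    """
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores.

    Assumption: this post-processing step is the kind of computation
    the abstract assigns to the SIMD device.
    """
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]           # transpose K
    scores = matmul(q, k_t)                        # Q K^T
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = [softmax(row) for row in scaled]     # one softmax per query
    return matmul(weights, v)                      # weighted sum of values


if __name__ == "__main__":
    q = [[1.0, 0.0], [0.0, 1.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    for row in attention(q, k, v):
        print(row)
```

Each output row is a weighted average of the rows of `v`, with weights that sum to 1 for each query. The two matrix multiplies dominate the arithmetic, which is consistent with the abstract's emphasis on accelerating them in memory.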

Patent Claims

4 claims

Legal claims defining the scope of protection, as filed with the USPTO.

8. The apparatus of claim 1 wherein each of the chiplets comprises a plurality of tiles arranged symmetrically to each other, each of the tiles comprising a portion of the plurality of slices.

16. The device of claim 11 wherein each of the chiplets comprises a plurality of tiles arranged symmetrically to each other, each of the tiles comprising a portion of the plurality of slices.

17. The device of claim 11 wherein the DIMC device is configured to support one or more block floating point data types using a shared exponent or to support a block structured sparsity.

18. The device of claim 11 further comprising a network on chip (NoC) device configured for a multicast process and coupled to each of the plurality of slices.
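Claim 17 refers to block floating point data types that use a shared exponent. As a rough illustration of that general idea, the sketch below quantizes a block of values to signed integer mantissas that all share one power-of-two exponent. The function names, the 8-bit mantissa width, and the rounding and clamping choices are assumptions made for this example; the claim does not specify them, and the block structured sparsity alternative in the same claim is not shown.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize floats to integer mantissas with one shared exponent.

    Returns (mantissas, shared_exp). The mantissa width is an assumed
    parameter, not a value taken from the patent.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # frexp writes max_abs as m * 2**e with 0.5 <= m < 1,
    # so every value in the block has magnitude below 2**e.
    _, shared_exp = math.frexp(max_abs)
    limit = 2 ** (mantissa_bits - 1) - 1           # largest signed mantissa
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return mantissas, shared_exp


def bfp_dequantize(mantissas, shared_exp, mantissa_bits=8):
    """Rebuild approximate floats from mantissas and the shared exponent."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


if __name__ == "__main__":
    block = [1.0, 0.5, -0.25, 0.0]
    mantissas, exp = bfp_quantize(block)
    print(mantissas, exp)
    print(bfp_dequantize(mantissas, exp))
```

Because the exponent is stored once per block rather than once per value, the per-element work reduces to integer arithmetic on the mantissas. The cost is that small values in a block with one large value lose precision.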

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F G06N

Patent Metadata

Filing Date

October 17, 2022

Publication Date

January 30, 2024
