US-11481317

Extended memory architecture

Published: October 25, 2022

Assignee: not available in USPTO data we have

Inventors: not available in USPTO data we have

Technical Abstract

Systems, apparatuses, and methods related to extended memory communication subsystems for performing extended memory operations are described. An example apparatus can include a plurality of computing devices. Each of the computing devices can include a processing unit configured to perform an operation on a block of data, and a memory array configured as a cache for each respective processing unit. The example apparatus can further include a first communication subsystem coupled to a host and to each of the plurality of communication subsystems. The example apparatus can further include a plurality of second communication subsystems coupled to each of the plurality of computing devices. Each of the plurality of computing devices can be configured to receive a request from the host, send a command to execute at least a portion of the operation, and receive a result of performing the operation from the at least one hardware accelerator.
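The request flow described in the abstract (host request, command to a hardware accelerator, result returned) can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class names, the `route` and `handle_request` methods, and the byte-filtering operation are all hypothetical and do not come from the patent, which describes hardware rather than software objects.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class HardwareAccelerator:
    """Hypothetical accelerator: applies one operation to a block of data."""
    operation: Callable[[bytes], bytes]

    def execute(self, block: bytes) -> bytes:
        return self.operation(block)


@dataclass
class ComputingDevice:
    """Hypothetical computing device with a local cache and an accelerator link."""
    accelerator: HardwareAccelerator
    cache: List[bytes] = field(default_factory=list)

    def handle_request(self, block: bytes) -> bytes:
        # Receive the host request, keep the block in the local cache,
        # send the command to the accelerator, and receive its result.
        self.cache.append(block)
        return self.accelerator.execute(block)


@dataclass
class FirstCommunicationSubsystem:
    """Hypothetical host-facing subsystem that routes requests to devices."""
    devices: List[ComputingDevice]

    def route(self, device_index: int, block: bytes) -> bytes:
        return self.devices[device_index].handle_request(block)


def keep_even_bytes(block: bytes) -> bytes:
    # Assumed example operation: filter a block down to its even-valued bytes.
    return bytes(b for b in block if b % 2 == 0)


accelerator = HardwareAccelerator(operation=keep_even_bytes)
subsystem = FirstCommunicationSubsystem(
    devices=[ComputingDevice(accelerator) for _ in range(2)]
)
result = subsystem.route(0, bytes([1, 2, 3, 4]))
```

In this model the host only ever talks to the first communication subsystem, and the result it gets back can be smaller than the block that was operated on.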

Patent Claims

18 claims

Legal claims defining the scope of protection, as filed with the USPTO.

2. The apparatus of claim 1, wherein the plurality of second communication subsystems comprises a plurality of interconnect interfaces.

3. The apparatus of claim 1, wherein the first communication subsystem is a peripheral component interconnect express (PCIe) interface.

4. The apparatus of claim 1, wherein a particular one of the plurality of second communication subsystems is a controller and the controller is coupled to a memory device.

5. The apparatus of claim 4, wherein the memory device comprises at least one of a double data rate (DDR) memory, a three-dimensional (3D) cross-point memory, a NAND memory, or any combination thereof.

6. The apparatus of claim 1, wherein the accelerator is on-chip and is coupled to a static random access device (SRAM).

7. The apparatus of claim 1, wherein the accelerator is on-chip and is coupled to an arithmetic logic unit (ALU) configured to perform an arithmetic operation or a logical operation, or both.

8. The apparatus of claim 1, wherein the processing unit of each of the plurality of computing devices is configured with a reduced instruction set architecture.

10. The apparatus of claim 1, wherein each of the plurality of computing devices is configured as reduced instruction set computer (RISC) compliant.

11. The apparatus of claim 1, wherein the at least one hardware accelerator is configured to perform the extended memory operation by accessing a non-volatile memory device coupled to the plurality of second communication subsystems.

12. The apparatus of claim 1, wherein the at least one hardware accelerator is configured to send a request for an additional hardware accelerator to perform an additional portion of the extended memory operation.

14. The system of claim 13, wherein the plurality of computing devices, the first communication subsystem, and the second plurality of communication subsystems are configured on a field programmable gate array (FPGA) and the non-volatile memory device is external to the FPGA.

15. The system of claim 13, wherein the plurality of computing devices are each configured as a reduced instruction set computer (RISC)-V compliant.

18. The system of claim 17, wherein the peripheral port is coupled to an off-chip serial port through the first communication subsystem and through at least one of the second plurality of communication subsystems.

19. The system of claim 13, wherein the first communication subsystem is directly coupled to at least one of the second plurality of communication subsystems.

20. The system of claim 19, wherein the at least one of the second plurality of communication subsystem is configured to transfer the block of data from the non-volatile memory device to the first communication subsystem and to the host, wherein the transfer of the block of data bypasses the plurality of computing devices.

21. The system of claim 19, wherein an AXI interconnect that directly couples the first communication subsystem to at least one of the second plurality of communication subsystem is a faster AXI interconnect than an AXI interconnect that couples the plurality of computing devices to the first communication subsystem and to the at least one of the second plurality of communication subsystems.

23. The method of claim 22, wherein the reduced size block of data is transferred to the host via a PCIe interface coupled to the first communication subsystem.

24. The method of claim 22, further comprising causing, using a memory controller, the block of data to be transferred from the memory device to the second communication subsystem and subsequently to the first communication subsystem, wherein the block of data bypasses the plurality of computing devices.
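Claims 20 and 24 describe a transfer in which the block of data travels from the memory device through a second communication subsystem to the first communication subsystem and the host, bypassing the computing devices, while claim 23 refers to a reduced size block reaching the host. The contrast between the two paths can be sketched as follows. The function name, the hop labels, and the reduction function are hypothetical, and the ordering of hops on the compute path is an assumption, since the independent claims are not reproduced in this listing.

```python
from typing import Callable, Dict, List, Optional, Tuple


def transfer_to_host(
    memory_device: Dict[int, bytes],
    address: int,
    reduce_fn: Optional[Callable[[bytes], bytes]] = None,
) -> Tuple[bytes, List[str]]:
    """Return the payload delivered to the host and the hops it passed through.

    With reduce_fn=None the block takes the bypass path (no computing device).
    Otherwise a computing device is assumed to reduce the block first.
    """
    block = memory_device[address]
    hops = ["memory_device", "second_subsystem"]
    if reduce_fn is not None:
        # Assumed compute path: a computing device shrinks the block.
        hops.append("computing_device")
        block = reduce_fn(block)
    hops += ["first_subsystem", "host"]
    return block, hops


memory = {0: b"abcdef"}
full_block, bypass_hops = transfer_to_host(memory, 0)
reduced_block, compute_hops = transfer_to_host(memory, 0, lambda b: b[:2])
```

The bypass path delivers the block unchanged, whereas the compute path trades an extra hop for a smaller payload on the host link.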

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F

Patent Metadata

Filing Date

June 26, 2020

Publication Date

October 25, 2022
