US-10817442

Apparatus and methods for in data path compute operations

Published: October 27, 2020

Assignee: not available in the USPTO data we have

Inventors: not available in the USPTO data we have

Technical Abstract

The present disclosure includes apparatuses and methods for in data path compute operations. An example apparatus includes an array of memory cells. Sensing circuitry is selectably coupled to the array. A plurality of shared input/output (I/O) lines provides a data path. The plurality of shared I/O lines selectably couples a first subrow of a row of the array via the sensing circuitry to a first compute component in the data path to move a first data value from the first subrow to the first compute component and a second subrow of the respective row via the sensing circuitry to a second compute component to move a second data value from the second subrow to the second compute component. An operation is performed on the first data value from the first subrow using the first compute component substantially simultaneously with movement of the second data value from the second subrow to the second compute component.
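The overlap described in the abstract, where one compute component operates on a subrow while the next subrow is still being moved over the shared I/O lines, can be illustrated with a small software simulation. This is only a sketch under stated assumptions: the function name `pipelined_schedule`, the step-based timing model, and the bitwise-invert operation are hypothetical and are not taken from the patent, which describes hardware rather than code.

```python
from typing import Callable, List, Tuple

def pipelined_schedule(
    subrows: List[List[int]],
    op: Callable[[int], int],
) -> Tuple[List[List[int]], List[Tuple[int, List[str]]]]:
    """Simulate overlapped movement and compute across subrows.

    Hypothetical timing model: at each step, the subrow moved on the
    previous step is operated on by its compute component while the next
    subrow is moved over the shared I/O lines.
    """
    results: List[List[int]] = []
    timeline: List[Tuple[int, List[str]]] = []
    latched = None  # (subrow index, data) moved on the previous step

    for step in range(len(subrows) + 1):
        events: List[str] = []
        if latched is not None:
            idx, data = latched
            # Compute on the subrow that arrived one step earlier.
            results.append([op(v) for v in data])
            events.append(f"compute subrow {idx}")
        if step < len(subrows):
            # In the same step, move the next subrow toward its compute component.
            latched = (step, subrows[step])
            events.append(f"move subrow {step}")
        else:
            latched = None
        timeline.append((step, events))
    return results, timeline

if __name__ == "__main__":
    row = [[1, 0, 1, 1], [0, 0, 1, 0]]  # one row split into two subrows
    out, steps = pipelined_schedule(row, lambda bit: bit ^ 1)
    for step, events in steps:
        print(step, events)
    print(out)
```

In this model, a row with N subrows finishes in N + 1 steps rather than 2N, because every step after the first performs a compute and a move together.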

Patent Claims

20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. An apparatus, comprising: a plurality of input/output (I/O) lines shared to enable performance of neural network data processing on a stream of data values, the I/O lines configured to selectably couple via sensing circuitry: a first subrow of a row of an array of memory cells to a first compute component in a data path to move a first data value from the first subrow to the first compute component; and a second subrow of the respective row to a second compute component in the data path to move a second data value from the second subrow to the second compute component; and a controller configured to direct the performance of the neural network data processing by: a compute operation performed by the first compute component on the first data value moved from the first subrow substantially simultaneously with movement of the second data value from the second subrow to the second compute component; and selectable coupling of a plurality of subrows to a corresponding plurality of compute components in a selectable number of logic stripes in the data path.

2. The apparatus of claim 1, wherein the data path further comprises: a first logic stripe including a number of a plurality of first compute components that corresponds to a number of a plurality of memory cells of the first subrow; and a second logic stripe including a number of a plurality of second compute components that corresponds to a number of a plurality of memory cells of the second subrow.

3. The apparatus of claim 1, wherein the data path further comprises: a compute unit including a plurality of logic stripes that each includes a plurality of compute components; wherein each of the plurality of compute components is associated with at least one of the plurality of shared I/O lines local to the array.

4. The apparatus of claim 1, wherein the data path further comprises a number of a plurality of logic stripes that corresponds to a number of a plurality of subrows of the respective row.

5. The apparatus of claim 1, wherein the data path further comprises: a number of the plurality of shared I/O lines that corresponds to a number of a plurality of memory cells of a subrow of the respective row; wherein a logic stripe includes a number of a plurality of compute components that corresponds to the number of the plurality of memory cells of the subrow coupled to a respective logic stripe.

6. A system, comprising: a host configured to generate instructions for performance of an alpha blend graphics operation; and a memory device coupled to the host, wherein the memory device is configured to receive the instructions and perform the alpha blend graphics operation by: a plurality of logic stripes in a data path for in data path compute operations, comprising a first logic stripe including a number of a plurality of first compute components that corresponds to a number of a plurality of memory cells of a first subrow of a row of the array; and control circuitry configured to direct execution of the instructions by: movement of a first data value from a first subrow of a first row of the array, via an input/output (I/O) line shared in the data path, to a first compute component of a first logic stripe in the data path; performance of a first operation on the first data value from the first subrow using the first compute component; movement of a second data value, resulting from performance of the first operation, from the first logic stripe via connection circuitry to a second compute component of a second logic stripe in the data path; and overlay of data values of a first subrow corresponding to a first image with data values of a second subrow corresponding to a second image to form a background image and a foreground image in a combined image.

7. The system of claim 6, wherein the host comprises a processing resource configured to direct: input of the data values, corresponding to the first image and the second image, for storage by an array of memory cells in the memory device; and input of the instructions to the control circuitry.

8. The system of claim 6, wherein the control circuitry comprises a state machine.

9. The system of claim 6, wherein the control circuitry comprises a sequencer.

10. The system of claim 6, wherein the control circuitry comprises a shift controller configured to control shifting data in the memory device.

11. The system of claim 6, further comprising a control bus configured to provide the instructions from the host as signals to be decoded by the control circuitry.

12. The system of claim 6, wherein the control circuitry is local to the memory device and external to the host.

13. The system of claim 6, wherein the control circuitry is further configured to execute instructions from the host to direct performance of a second operation on the second data value using the second compute component of the second logic stripe.

14. The system of claim 6, wherein the control circuitry is further configured to execute instructions from the host to direct movement of a third data value, resulting from performance of the second operation, from the second logic stripe via the connection circuitry to a third compute component of a third logic stripe.

15. The system of claim 6, wherein the control circuitry is further configured to execute instructions from the host to direct: performance of a number of a plurality of logical operation sequences by systolic movement of logical operation results through a corresponding number of a plurality of logic stripes; wherein a number of a plurality of the logical operation results are computed using a corresponding number of a plurality of compute components of the corresponding number of the plurality of logic stripes.

16. The system of claim 6, wherein: the plurality of logic stripes includes a number of a plurality of regions that corresponds to a number of a plurality of sequences of logical operations; the control circuitry is further configured to execute instructions from the host to direct initiation of the plurality of sequences of logical operations substantially simultaneously; and each of the plurality of sequences of logical operations is directed to be performed in a different one of the plurality of regions.

17. A method for operating a memory device, comprising: performing a number of matrix multiplication operations on a plurality of data value matrices by: performance of a first operation on a data value moved from a memory cell in a first subrow in a first row of an array of memory cells to a first logic stripe for in data path compute operations; and movement of the data value, to enable performance of a second operation thereon, to a selected second logic stripe via connection circuitry selectably coupling the first logic stripe and the second logic stripe; and producing a matrix product by selectably coupling a plurality of subrows to a corresponding plurality of logic stripes in the data path.

18. The method of claim 17, wherein the method further comprises: performing the first operation using a first compute component of the first logic stripe; and performing the second operation using a second compute component of the second logic stripe.

19. The method of claim 17, wherein the method further comprises: performing the first operation and the second operation as a first two operations in a number of a plurality of logical operations that corresponds to a number of the plurality of logic stripes; wherein the plurality of logical operations is a sequential plurality of logical operations performed on the data value moved from the memory cell in the first subrow and a number of output data values to produce the matrix product.

20. The method of claim 17, wherein the method further comprises: moving a result of completion of a last operation of a sequential plurality of logical operations from a last logic stripe to a selected memory cell in a row of the array; wherein the last logic stripe is a logic stripe in which the last operation is performed.
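Several claims above (notably 13 to 15 and 17 to 20) describe results moving from one logic stripe to the next, with each stripe applying its own operation and the last stripe's result returned to the array. A rough software analogy is sketched below. It is an illustration only: `systolic_run`, the per-stripe integer operations, and the one-value-per-stripe model are assumptions made for this example, not details from the patent.

```python
from typing import Callable, List, Optional, Tuple

def systolic_run(
    inputs: List[int],
    stage_ops: List[Callable[[int], int]],
) -> Tuple[List[int], int]:
    """Push values through a chain of hypothetical 'logic stripes'.

    Each stripe holds at most one value per step. On every step, each
    occupied stripe applies its operation and hands the result to the
    next stripe; the last stripe's result leaves the chain (standing in
    for the write-back to a selected memory cell).
    """
    stripes: List[Optional[int]] = [None] * len(stage_ops)
    pending = list(inputs)
    outputs: List[int] = []
    steps = 0

    while pending or any(v is not None for v in stripes):
        # Every occupied stripe computes in the same step.
        results = [None if v is None else op(v)
                   for op, v in zip(stage_ops, stripes)]
        if results[-1] is not None:
            outputs.append(results[-1])
        # Shift results one stripe forward and load the next input.
        next_in = pending.pop(0) if pending else None
        stripes = [next_in] + results[:-1]
        steps += 1
    return outputs, steps

if __name__ == "__main__":
    ops = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    print(systolic_run([1, 2, 3], ops))
```

With m inputs and n stripes, this model takes m + n steps, since once the chain is full all stripes operate on different values in the same step.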

Classification Codes (CPC)

Cooperative Patent Classification codes for this invention.

G06F G11C

Patent Metadata

Filing Date

October 17, 2019

Publication Date

October 27, 2020
