US-20260024576-A1

Multiplexor Placement for Implementing a Processing Unit in Memory

Published: January 22, 2026
Assignee: not available in USPTO data we have
Technical Abstract

A memory device can include a first error correction code (ECC) circuitry, a second ECC circuitry, and a multiplexor (MUX). The first ECC circuitry can receive first data from a first bank. The second ECC circuitry can receive second data from a second bank. The MUX can receive the first data from the first ECC circuitry and the second data from the second ECC circuitry. The MUX can provide the first data in a first portion of a duration of time. The MUX can provide the second data in a second portion of the duration of time. A processing unit (PU) can perform a first plurality of multiplication operations utilizing the first data provided by the MUX during the first portion of the duration of time and a second plurality of multiplication operations utilizing the second data provided by the MUX during the second portion of the duration of time.

Patent Claims

Legal claims defining the scope of protection, as filed with the USPTO.

1

An apparatus, comprising: a first error correction code (ECC) circuitry configured to receive first data from a first bank of a memory device; a second ECC circuitry configured to receive second data from a second bank of the memory device; a multiplexor (MUX) coupled to the first ECC circuitry and the second ECC circuitry and configured to: receive the first data from the first ECC circuitry and the second data from the second ECC circuitry; provide the first data in a first portion of a duration of time; provide the second data in a second portion of the duration of time; and a processing unit (PU) coupled to the MUX and configured to: perform a first plurality of multiplication operations utilizing the first data provided by the MUX during the first portion of the duration of time; and perform a second plurality of multiplication operations utilizing the second data provided by the MUX during the second portion of the duration of time.

2

The apparatus of claim 1, wherein the first bank and the second bank are within a same bank group of the memory device.

3

The apparatus of claim 1, further comprising a first data sense amplifier (DSA) configured to provide the first data to the first ECC circuitry in a different duration of time.

4

The apparatus of claim 3, wherein the first DSA is configured to receive the first data from the first bank.

5

The apparatus of claim 3, further comprising a second DSA configured to provide the second data to the second ECC circuitry in the different duration of time.

6

The apparatus of claim 5, wherein the first portion of the duration of time is a first half of the duration of time, wherein the second portion of the duration of time is a second half of the duration of time, and wherein the second DSA is configured to receive the second data from the second bank.

7

The apparatus of claim 1, wherein the PU comprises a plurality of multiply-accumulate (MAC) units configured to: perform the first plurality of multiplication operations utilizing the first data provided by the MUX during the first portion of the duration of time; and perform the second plurality of multiplication operations utilizing the second data provided by the MUX during the second portion of the duration of time.

8

The apparatus of claim 1, wherein the MUX is further configured to receive additional data after the duration of time.

9

An apparatus, comprising: first data sense amplifiers (DSAs) configured to receive first data from a first bank of a memory device; second DSAs configured to receive second data from a second bank of the memory device; a multiplexor (MUX) configured to: receive the first data from the first DSAs and the second data from the second DSAs; provide the first data in a first portion of a duration of time; provide the second data in a second portion of the duration of time; and an error correction code (ECC) circuitry configured to: perform a first plurality of operations utilizing the first data provided by the MUX during the first portion of the duration of time; and perform a second plurality of operations utilizing the second data provided by the MUX during the second portion of the duration of time.

10

The apparatus of claim 9, wherein the ECC circuitry is further configured to, responsive to performing the first plurality of operations, generate a first output data in the first portion of the duration of time, wherein the first portion of the duration of time is a first half of the duration of time.

11

The apparatus of claim 10, wherein the ECC circuitry is further configured to provide the first output data to a processing unit (PU) in the first half of the duration of time.

12

The apparatus of claim 11, further comprising the PU configured to perform a first plurality of multiplication operations in a first half of a different duration of time.

13

The apparatus of claim 9, wherein the ECC circuitry is further configured to, responsive to performing the second plurality of operations, generate a second output data in the second portion of the duration of time, wherein the second portion of the duration of time is a second half of the duration of time.

14

The apparatus of claim 13, wherein the ECC circuitry is further configured to provide the second output data to a processing unit (PU) in the second half of the duration of time.

15

The apparatus of claim 14, further comprising the PU configured to perform a second plurality of multiplication operations in a second half of a different duration of time.

16

A method, comprising: providing first data from a multiplexor (MUX) of a memory device to an error correction code (ECC) circuitry of the memory device in a first portion of a duration of time; providing second data from the MUX to the ECC circuitry in a second portion of the duration of time; performing, by the ECC circuitry, a first plurality of operations using the first data in the first portion of the duration of time; performing, by the ECC circuitry, a second plurality of operations using the second data in the second portion of the duration of time; providing, by the ECC circuitry, a first output of the first plurality of operations to a processing unit (PU) of the memory device in the first portion of the duration of time; and providing, by the ECC circuitry, a second output of the second plurality of operations to the PU in the second portion of the duration of time.

17

The method of claim 16, further comprising providing the first data from a first data sense amplifier (DSA) to the MUX in a second duration of time.

18

The method of claim 17, further comprising providing the first data from a first bank to the first DSA in the second duration of time, wherein the first portion of the duration of time is a first half of the duration of time and the second portion of the duration of time is a second half of the duration of time.

19

The method of claim 16, further comprising providing the second data from a second data sense amplifier (DSA) to the MUX in a second duration of time.

20

The method of claim 19, further comprising providing the second data from a second bank to the second DSA in the second duration of time.

Detailed Description

Complete technical specification and implementation details from the patent document.

This application claims the benefit of U.S. Provisional Application No. 63/672,124, filed on Jul. 16, 2024, the contents of which are incorporated herein by reference.

The present disclosure relates generally to memory, and more particularly to multiplexor placement for implementing a processing unit in memory.

Memory devices are typically provided as internal, semiconductor, integrated circuits in computers or other electronic devices. There are many different types of memory including volatile and non-volatile memory. Volatile memory can require power to maintain its data and includes random-access memory (RAM), dynamic random access memory (DRAM), and synchronous dynamic random access memory (SDRAM), among others. Non-volatile memory can provide persistent data by retaining stored data when not powered and can include NAND flash memory, NOR flash memory, read only memory (ROM), Electrically Erasable Programmable ROM (EEPROM), Erasable Programmable ROM (EPROM), and resistance variable memory such as phase change random access memory (PCRAM), resistive random access memory (RRAM), and magnetoresistive random access memory (MRAM), among others.

Memory is also utilized as volatile and non-volatile data storage for a wide range of electronic applications. Non-volatile memory may be used in, for example, personal computers, portable memory sticks, digital cameras, cellular telephones, portable music players such as MP3 players, movie players, and other electronic devices. Memory cells can be arranged into arrays, with the arrays being used in memory devices.

The present disclosure describes multiplexor placement for implementing a processing unit in memory. In various examples, a multiplexor (MUX) can be placed strategically between data sense amplifiers (DSA) and a processing unit (PU) to allow for the die size reduction of the PU.

In some previous approaches, a PU can include a plurality of multiply-accumulate (MAC) units. The plurality of MAC units can receive a plurality of data values. A PU may be implemented using a set quantity of MAC units. For example, the PU may traditionally be implemented using thirty-two MAC units. Each of the MAC units can include an accumulator register that stores thirty-two data values (e.g., bits). Each of the thirty-two MAC units can receive eight bits of data every time data is sensed (e.g., read) from a memory array (e.g., a bank of memory). The read latency can be 5 nanoseconds (ns). Each of the thirty-two MAC units can receive eight bits of data every 5 ns. Each of the eight bits of data can represent a different data value. Each of the thirty-two MAC units can receive a data value every time data is sensed. For example, each of the thirty-two MAC units can receive a data value every 5 ns. A separate PU can be implemented for each bank of a memory device. For example, a first bank can be coupled to a first PU and a second bank can be coupled to a second PU.

However, the MAC units may perform MAC operations in less time than the read latency. For example, the MAC units may perform a plurality of operations utilizing the received eight bits of data in less time than the 5 ns read latency. As such, the MAC units, or portions of the MAC units, may be underutilized because they remain inactive for the remaining portion of the 5 ns. Implementing a PU per bank of the memory device can also utilize more die space than is needed if the PUs are not being fully utilized.
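The underutilization described above can be illustrated with a short calculation. This is a minimal sketch: the function name and the 2.5 ns busy time are hypothetical, since the disclosure states only that the MAC operations take less time than the 5 ns read latency.

```python
READ_LATENCY_NS = 5.0  # read latency from the example above


def mac_utilization(busy_ns: float, read_latency_ns: float = READ_LATENCY_NS) -> float:
    """Fraction of the read latency during which a MAC unit is doing work."""
    if not 0 <= busy_ns <= read_latency_ns:
        raise ValueError("busy time must lie within the read latency")
    return busy_ns / read_latency_ns


# Hypothetical busy time: the source only says it is less than the read latency.
utilization = mac_utilization(busy_ns=2.5)
idle_ns = READ_LATENCY_NS * (1 - utilization)
```

Under that assumed busy time, each MAC unit would sit idle for half of every read latency, which is the gap the shared-PU arrangement is meant to close.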

In order to address these and other deficiencies of previous approaches, embodiments of the present disclosure implement a PU that provides data (e.g., data values) to the MAC units such that the MAC units are continually utilized. For example, a single PU can be implemented per pair of banks of a memory device. The single PU can be coupled to the multiple banks of the memory device using a MUX. The placement of the MUX between the PU and the multiple banks can be used to reduce the quantity of DSAs and/or error correction code (ECC) circuitry. Reducing the quantity of DSAs and/or ECC circuitry can reduce the cost of implementing the memory device, can reduce power usage, and/or can reduce die size.

As used herein, a PU can include hardware and/or firmware to perform a plurality of operations. The PU can include MAC units which include hardware and/or firmware for performing a plurality of multiplication operations and a plurality of accumulation operations referred to as MAC operations.

For example, in embodiments of the present disclosure, a MUX can be implemented to provide data values to the PU. The MUX can provide portions of the data values in less time than the read latency. As used herein, the read latency refers to an interval of time starting when first data is sensed from the memory array and ending when second data is sensed from the array. For example, the MUX can provide a first portion and a second portion of the data values during the read latency such that the MAC units are utilized for the entirety of the read latency.

Given that the MAC units remain utilized for the read latency, fewer MAC units can be utilized than are utilized if the MAC units are only partially utilized during the read latency (e.g., as with previous approaches). For example, if thirty-two MAC units are partially utilized during a read latency (e.g., as with previous approaches), then only sixteen MAC units can be fully utilized for the same duration of time with the use of a MUX to continuously provide data to the sixteen MAC units in accordance with embodiments of the present disclosure. As used herein, a MUX can continuously provide data if the MUX provides data multiple times over a particular duration of time. For example, a MUX can continuously provide data values during a read latency if the MUX provides both first data and second data during the read latency, where the first data and the second data are provided separately.

The PU can be used to implement an artificial neural network (ANN) using the MAC units, for example. As used herein, ANNs can provide learning by forming probability weight associations between an input and an output. The probability weight associations can be provided by a plurality of nodes that comprise the ANN. The nodes together with weights, biases, and activation functions can be used to generate an output of the ANN based on the input to the ANN. A plurality of nodes of the ANN can be grouped to form layers of the ANN.

As used herein, artificial intelligence (AI) refers to the ability to improve an apparatus through “learning” such as by storing patterns and/or examples which can be utilized to take actions at a later time. Deep learning refers to a device's ability to learn from data provided as examples. Deep learning can be a subset of AI. Neural networks, among other types of networks, can be classified as deep learning. Improving the efficiency at which ANNs are executed can improve a function of a memory device executing the ANN and the function of the device in which the memory device is implemented. For example, improving the latency, power consumption, and/or throughput of the memory device implementing the ANN can cause an improvement to the latency, power consumption, and/or throughput of a memory device.

As used herein, “a number of” something can refer to one or more of such things. For example, a number of memory devices can refer to one or more memory devices. A “plurality” of something intends two or more. Additionally, designators such as “N,” as used herein, particularly with respect to reference numerals in the drawings, indicates that a number of the particular feature so designated can be included with a number of embodiments of the present disclosure.

The figures herein follow a numbering convention in which the first digit or digits correspond to the drawing figure number and the remaining digits identify an element or component in the drawing. Similar elements or components between different figures may be identified by the use of similar digits. As will be appreciated, elements shown in the various embodiments herein can be added, exchanged, and/or eliminated so as to provide a number of additional embodiments of the present disclosure. In addition, the proportion and the relative scale of the elements provided in the figures are intended to illustrate various embodiments of the present disclosure and are not to be used in a limiting sense.

FIG. 1 is a block diagram of an apparatus in the form of a computing system 100 including a memory device 120 in accordance with a number of embodiments of the present disclosure. As used herein, a memory device 120, a bank 130 of memory cells, also referred to as a memory array 130, a host 110, and/or the PU might also be separately considered an “apparatus.”

In this example, system 100 includes a host 110 coupled to memory device 120 via an interface 156. The computing system 100 can be a personal laptop computer, a desktop computer, a digital camera, a mobile telephone, a memory card reader, or an Internet-of-Things (IoT) enabled device, among various other types of systems. Host 110 can include a number of processing resources (e.g., one or more processors, microprocessors, or some other type of controlling circuitry) capable of accessing memory 120. The system 100 can include separate integrated circuits, or both the host 110 and the memory device 120 can be on the same integrated circuit. For example, the host 110 may be a system controller of a memory system comprising multiple memory devices 120, with the system controller 110 providing access to the respective memory devices 120 by another processing resource such as a central processing unit (CPU).

In the example shown in FIG. 1, the host 110 is responsible for executing an operating system (OS) and/or various applications that can be loaded thereto (e.g., from memory device 120 via controller 140). The host 110 can provide access commands and/or security mode initialization commands to a memory device via the interface 156.

For clarity, the system 100 has been simplified to focus on features with particular relevance to the present disclosure. The memory array 130 can be a DRAM array, SRAM array, STT RAM array, PCRAM array, TRAM array, RRAM array, NAND flash array, and/or NOR flash array, for instance. The array 130 can comprise memory cells arranged in rows coupled by access lines (which may be referred to herein as word lines or select lines) and columns coupled by sense lines (which may be referred to herein as digit lines or data lines). Although a single array 130 is shown in FIG. 1, embodiments are not so limited. For instance, memory device 120 may include a number of arrays 130 (e.g., a number of banks 130 of DRAM cells).

The memory device 120 includes address circuitry to latch address signals provided over the interface 156. The interface 156 can include, for example, a physical interface employing a suitable protocol (e.g., a data bus, an address bus, and a command bus, or a combined data/address/command bus). Such protocol may be custom or proprietary, or the interface 156 may employ a standardized protocol, such as Peripheral Component Interconnect Express (PCIe), Gen-Z, CCIX, or the like. Address signals are received and decoded by a row decoder 146 and a column decoder 152 to access the memory array 130. Data can be read from memory array 130 by sensing voltage and/or current changes on the sense lines using sensing circuitry. The sensing circuitry can comprise, for example, sense amplifiers that can read and latch a page (e.g., row) of data from the memory array 130. The I/O circuitry can be used for bi-directional data communication with host 110 over the interface 156. Read/write circuitry is used to write data to the memory array 130 or read data from the memory array 130.

Controller 140 decodes signals provided by the host 110. These signals can include chip enable signals, write enable signals, and address latch signals that are used to control operations performed on the memory array 130, including data read, data write, and data erase operations. In various embodiments, the controller 140 is responsible for executing instructions from the host 110. The controller 140 can comprise a state machine, a sequencer, and/or some other type of control circuitry, which may be implemented in the form of hardware, firmware, or software, or any combination of the three.

In various instances, the controller 140 can receive signals provided by the host 110 including signals requesting operations to be performed by the PU 102. As used herein, the PU 102 can include hardware, firmware, and/or software for performing operations, such as, for example, multiplication operations, using data provided by the memory array 130 and/or the host 110.

In various examples, error correction code (ECC) circuitry 103 can be coupled to the column decoder 152. The ECC circuitry 103 can receive data from the memory array 130 (e.g., the sensing circuitry of the memory array). The ECC circuitry 103 can perform error correction operations to correct errors in data sensed from the memory array 130. The PU 102 can be coupled to the ECC circuitry 103. The PU 102 can perform a plurality of operations on data received from the ECC circuitry 103. The PU 102 can provide an output to the data path 104. The data path 104 can provide data to the interface 156. In various instances, the data path 104 can include Input/Output (I/O) lines and/or receivers and/or drivers. As used herein, receivers can include circuitry configured to receive a signal. Drivers can describe circuitry to drive a signal across a line or a plurality of lines.

The PU 102 can include multiple MAC units. The MAC units can perform operations (e.g., multiplication operations) to implement an ANN. The PU 102 can also include a MUX that receives data values from (e.g., sensed from) memory array 130 (e.g., data that has been corrected by the ECC circuitry 103). Although the MUX is described as being part of the PU 102, the MUX can also be external to the PU 102. In various examples, the MUX can be implemented upstream from the PU 102 (e.g., between the PU and the memory array).

The MUX can provide data values received at the same time continuously to the MAC units. For example, the MUX can receive a plurality of data values (e.g., represented using a quantity of bits) during a duration of time (e.g., during a time period). The MUX can provide a first portion of the data values followed by a second portion of the data values to the MAC units within the time period. Implementing a MUX in a PU to provide data to the MAC units allows for fewer MAC units to be utilized than implementing the PU without a MUX. Although the implementations described herein utilize a MUX, the examples described herein can be extended to include different circuitry that can receive data, divide the data, and provide the divided data continuously over a period of time. For example, registers can be utilized instead of a MUX to perform the functions of a MUX. Although the examples provided herein are given in the context of data values, the examples described herein can be extended to include bits. For example, the MUX can provide a first portion and a second portion of a plurality of bits that represent data values to MAC units during a time period.

FIG. 2 is a block diagram of a memory device 220 having a plurality of banks of memory cells in accordance with a number of embodiments of the present disclosure. The banks 230-0, 230-1, 230-2, 230-3, 230-4, 230-5, 230-6, 230-7, 230-8, 230-9, 230-10, 230-11, 230-12, 230-13, 230-14, 230-15 can be referred to collectively as banks 230. The banks 230 can be analogous to bank 130 previously described in connection with FIG. 1. Further, although 16 banks are shown in the example illustrated in FIGS. 2, 5, and 6, embodiments of the present disclosure are not limited to a particular number of banks.

The banks 230 can be grouped into bank groups 221. For example, the banks 230-0, 230-1, 230-8, 230-9 can be grouped into a first bank group (e.g., bank group 0). The banks 230-2, 230-3, 230-10, 230-11 can be grouped into a second bank group (e.g., bank group 1). The banks 230-4, 230-5, 230-12, 230-13 can be grouped into a third bank group (e.g., bank group 2). The banks 230-6, 230-7, 230-14, 230-15 can be grouped into a fourth bank group (e.g., bank group 3).

The banks of each respective bank group can be organized into pairs that share at least a PU 202. For example, the banks 230-0, 230-8 of bank group 0 share a first PU (e.g., the PU 202). The banks 230-1, 230-9 of bank group 0 share a second PU. The banks 230-2, 230-10 of bank group 1 share a third PU. The banks 230-3, 230-11 of bank group 1 share a fourth PU. The banks 230-4, 230-12 of bank group 2 share a fifth PU. The banks 230-5, 230-13 of bank group 2 share a sixth PU. The banks 230-6, 230-14 of bank group 3 share a seventh PU. The banks 230-7, 230-15 of bank group 3 share an eighth PU.
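The grouping and pairing listed above follow a regular pattern that can be expressed compactly. The following is a minimal sketch for the sixteen-bank example only; the function names and the zero-based PU index are illustrative assumptions rather than anything defined in the disclosure.

```python
NUM_BANKS = 16  # sixteen banks (suffixes 0 through 15) in the FIG. 2 example


def bank_group(bank: int) -> int:
    """Bank group index (0 to 3) for a bank suffix, per the grouping listed above."""
    return (bank % 8) // 2


def pu_index(bank: int) -> int:
    """Zero-based index of the shared PU (assumed numbering); banks i and i+8 pair up."""
    return bank % 8


def paired_bank(bank: int) -> int:
    """The other bank that shares a PU with `bank`."""
    return (bank + 8) % NUM_BANKS


groups = {g: [b for b in range(NUM_BANKS) if bank_group(b) == g] for g in range(4)}
```

For instance, `groups[0]` evaluates to `[0, 1, 8, 9]`, and banks 0 and 8 map to the same PU index, matching the first PU in the description.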

The example of FIG. 2 also shows the bank pairs sharing data sense amplifiers (e.g., DSA) 223 and error correction circuitry 203 in an analogous manner. The DSA 223 can also be referred to as sensing circuitry and/or sense amplifiers. In other examples (e.g., the example of FIG. 5), each bank has its own DSA and ECC circuitry but may share the PU 202. Having separate DSAs 223 and ECC circuitries 203 allows each of the banks 230 to provide data to the shared PU 202 independent of the other banks. A shared PU 202 can function at twice the speed as compared to a PU that is not shared. For example, if a read latency of the banks 230 is 5 ns, then a first bank 230-0 can provide data to the PU 202 at the start of the 5 ns and a second bank 230-8 can provide data to the PU 202 halfway through the 5 ns (e.g., 2.5 ns).
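The staggered delivery in this example can be sketched as a small timing helper. The function name is hypothetical, and even spacing of the banks across the read latency is an assumption generalized from the two-bank, 5 ns example.

```python
def delivery_offsets_ns(read_latency_ns: float, banks_sharing_pu: int) -> list[float]:
    """Offsets within one read latency at which each bank hands data to the shared PU.

    Even spacing is assumed; the text gives only the two-bank case.
    """
    slot = read_latency_ns / banks_sharing_pu
    return [i * slot for i in range(banks_sharing_pu)]


offsets = delivery_offsets_ns(read_latency_ns=5.0, banks_sharing_pu=2)
```

With a 5 ns read latency and two banks, the first bank delivers at 0 ns and the second at 2.5 ns, as in the description.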

Each of the PUs 202 can be coupled to or include a MUX 231-2. The MUX 231-2 enables the PU 202 to be implemented with fewer MAC units while retaining the same throughput. For example, if each of the PUs 202 is implemented with sixteen MAC units instead of thirty-two MAC units, then the PUs 202 can be implemented with sixty-four MAC units using the MUXs 231-2 instead of one hundred twenty-eight MAC units. The MUXs 231-2 allow for the PUs 202 to be implemented with at least half the MAC units compared to PUs implemented without the MUXs. If 4:1 MUXs are utilized in the PUs 202, the PUs 202 can be implemented with eight MAC units instead of one hundred twenty-eight MAC units used to implement PUs 202 without MUXs.
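The per-PU figures above (sixteen MAC units with a 2:1 MUX, eight with a 4:1 MUX) follow from dividing the thirty-two MAC unit baseline by the MUX ratio. A minimal sketch, assuming the baseline divides evenly and using a hypothetical function name:

```python
BASELINE_MACS_PER_PU = 32  # MAC units per PU without a MUX, per the text


def macs_per_pu(mux_inputs: int, baseline: int = BASELINE_MACS_PER_PU) -> int:
    """MAC units per PU when an N:1 MUX splits each delivery into N time slices."""
    if mux_inputs < 1 or baseline % mux_inputs:
        raise ValueError("baseline must divide evenly by the MUX ratio")
    return baseline // mux_inputs


two_to_one = macs_per_pu(2)
four_to_one = macs_per_pu(4)
```

The sketch covers only the per-PU count; device-wide totals depend on how many PUs are counted.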

Implementing the PUs 202 using fewer MAC units can decrease the cost of implementing the PUs 202. Implementing the PUs 202 using fewer MAC units can also decrease the size of the die that includes the PUs 202.

Data can be provided from the banks 230 to the DSAs 223 via a MUX 231-1. For example, the bank 230-0 and the bank 230-8 can provide data to the MUX 231-1. The bank 230-0 can provide 256 bits in a first duration of time and the bank 230-8 can provide 256 bits to the MUX 231-1 in a second duration of time.

Given that a shared DSA 223 is implemented, the DSA 223 may not be configured to receive 512 bits of data. The MUX 231-1 can provide the first 256 bits to the DSAs 223 in a first duration of time. The MUX 231-1 can provide the second 256 bits to the DSAs 223 in a second duration of time. The DSAs 223 can utilize half of the die space as compared to the die space utilized by a first DSA of a first bank and a second DSA of a second bank.

During the first duration of time, the DSAs 223 can provide the amplified first data to the ECC 203. During the second duration of time, the DSAs 223 can provide the amplified second data to the ECC 203.

The ECC 203 can perform a plurality of operations on the first data during a third duration of time and can provide a first output to the MUX 231-2. The ECC 203 can perform a plurality of operations on the second data during a fourth duration of time and can provide a second output to the MUX 231-2. The ECC 203 can provide a first number of bits (e.g., 128 bits) and a second number of bits (e.g., 128 bits) to the MUX 231-2. The ECC 203 can utilize half the die space as compared to ECCs that are not shared.

In a fifth duration of time, the MUX 231-2 can provide the first number of bits and the second number of bits to the PU 202. For example, the MUX 231-2 can provide the first number of bits to the PU 202 in a first portion of the fifth duration of time. The MUX 231-2 can provide the second number of bits to the PU 202 in a second portion of the fifth duration of time. The first portion can be a first half of the fifth duration of time and the second portion can be a second half of the fifth duration of time. Although the examples described herein are provided utilizing a first half of a duration of time and a second half of a duration of time, the examples can be applied to other portions of a duration of time and are not limited to half portions of the duration of time.
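The sequence of hand-offs along the shared path, from the bank-side MUX through the DSAs and ECC to the PU-side MUX and the PU, can be summarized as an ordered table. This is only a restatement of the durations named above as a data structure; the `Step` type, the labels "bank-side MUX" and "PU-side MUX", and the helper function are illustrative names, not terms from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    duration: str  # which duration of time, as labeled in the text
    source: str
    sink: str
    data: str


SHARED_PATH = [
    Step("first", "bank-side MUX", "DSAs", "first 256 bits"),
    Step("first", "DSAs", "ECC", "amplified first data"),
    Step("second", "bank-side MUX", "DSAs", "second 256 bits"),
    Step("second", "DSAs", "ECC", "amplified second data"),
    Step("third", "ECC", "PU-side MUX", "first output"),
    Step("fourth", "ECC", "PU-side MUX", "second output"),
    Step("fifth, first half", "PU-side MUX", "PU", "first number of bits"),
    Step("fifth, second half", "PU-side MUX", "PU", "second number of bits"),
]


def data_into(sink: str) -> list[str]:
    """Data items delivered to `sink`, in the order they arrive."""
    return [s.data for s in SHARED_PATH if s.sink == sink]


pu_inputs = data_into("PU")
```

Reading the table top to bottom shows that the PU receives both outputs within the single fifth duration, one per half.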

The PU 202 can function at half the speed as compared to the speed at which a first PU of a first bank and a second PU of a second bank function. The PU 202 can utilize half of the die space as compared to the die space utilized by a first PU of a first bank and a second PU of a second bank. The DSAs 223 and the ECCs 203 can utilize half the space as compared to the space utilized by the DSAs and ECC coupled to a first bank and the DSAs and ECC coupled to a second bank. The DSAs 223 and the ECCs 203 can also function at half the speed.

FIG. 3 is a block diagram of a processing unit 302 (e.g., the PU 102 of FIG. 1 and/or the PU 202 of FIG. 2) including a MUX 331 (e.g., a 2:1 MUX) in accordance with a number of embodiments of the present disclosure. Although the MUX 331 is shown as being part of the PU 302, the MUX 331 can also be implemented external to the PU 302. For example, the MUX 331 can be implemented between the PU 202 and the ECC 203 of FIG. 2. The PU 302 can include the MUX 331 (e.g., 2:1 MUX), a shift register 332, and MAC units 333. The PU 302 can receive data from banks (e.g., memory array), as previously described herein. The PU 302 can also receive data from the data bus 336. The data bus 336 can include receivers and/or drivers. The data bus 336 can couple the PU 302 to the interface of the memory device (e.g., via a common data bus). The data bus 336 can be used to provide data to the PU 302 from a host coupled to the memory device and/or from the banks of the memory device.

In the example of FIG. 3, the PU 302 can receive data from the banks and/or from the host via the data bus 336 once per read latency period. For example, a bank can provide data to the PU 302 once every 5 ns. Although the read latency is described as being 5 ns, other latencies can be utilized to describe a duration of time used to provide data to the PU 302. For example, data can be provided to the PU 302 every 10 ns or 2.5 ns.

An operand B can be provided to the PU 302 and stored in the registers 332. Operand B can comprise two hundred fifty-six bits, which can represent, for example, thirty-two data values. The thirty-two data values can be stored in the registers 332. For example, the shift registers 332 can include thirty-two eight-bit registers. Each of the registers of the registers 332 can store eight bits (e.g., a data value). The operand B can be provided from a bank (e.g., DRAM array) of the memory device.

The registers 332 can be shift registers 332. The shift registers 332 can provide the same data value (e.g., eight bits) to each of the MAC units 333. Once a data value (e.g., eight bits) has been provided to the MAC units 333, the shift registers 332 can shift a position of the data values such that the next data value (e.g., the next eight bits) is available and the data value previously provided is last in line. The shift registers 332 can then provide the next data value to each of the MAC units 333. In such a fashion, the shift registers 332 can rotate through the thirty-two data values (e.g., rotate through the two hundred fifty-six bits), providing one data value (e.g., eight bits) at a time to the MAC units 333.
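The rotation behavior of the shift registers can be modeled in software as follows. This is a behavioral sketch only, using Python's `collections.deque` as a stand-in for the hardware registers; the function name and the placeholder values are assumptions.

```python
from collections import deque


def rotate_operand_b(values: list[int], steps: int) -> list[int]:
    """Return the sequence of values broadcast to the MAC units over `steps` shifts."""
    regs = deque(values)
    broadcast = []
    for _ in range(steps):
        broadcast.append(regs[0])  # the same value goes to every MAC unit
        regs.rotate(-1)            # the value just provided moves to the back
    return broadcast


operand_b = list(range(32))  # thirty-two placeholder eight-bit data values
seen = rotate_operand_b(operand_b, 34)
```

After thirty-two shifts the sequence wraps around, so the thirty-third and thirty-fourth broadcasts repeat the first two values.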

The operand A can be provided from the I/O lines via the data bus 336. The operand A can also comprise thirty-two data values (e.g., two hundred fifty-six bits) that can remain active in the data bus 336 for the duration of the 5 ns. Although the examples described herein are provided in terms of data being provided in two hundred fifty-six bit chunks, other data size chunks can be provided to the PU 302. For example, the data bus 336 can carry sixteen data values (e.g., one hundred twenty-eight bits) or sixty-four data values (e.g., five hundred twelve bits).

The operand A can be provided to the MUX 331. The MUX 331 can be implemented internal to the PU 302 and between the MAC units 333 and the data bus 336. The MUX 331 can be implemented as an interface to the PU 302.

The MUX 331 can be a 2:1 MUX that provides a first half of the data values during a first half of the read latency and provides the second half of the data values during a second half of the read latency. For example, the MUX 331 can provide the first sixteen data values (e.g., one hundred twenty-eight bits) to the MAC units 333 during the first 2.5 ns of the read latency. The MUX 331 can provide the second sixteen data values (e.g., one hundred twenty-eight bits) to the MAC units during the second 2.5 ns of the read latency. The MUX 331 can receive a clock signal that enables the MUX 331 to provide data based on the partitioned read latency (e.g., every 2.5 ns). The term "partitioned read latency" refers to the read latency divided into intervals shorter than the read latency itself.
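As a rough sketch of this time partitioning, the following Python function splits an operand into equal groups and assigns each group a start time within the read latency. The function name `mux_schedule` and its parameters are assumptions made for illustration. Times are in picoseconds so that 2.5 ns is an exact integer.

```python
def mux_schedule(values, ratio, read_latency_ps):
    """Split values into `ratio` equal groups, one per time slot.

    Returns (slot_start_ps, group) pairs. Times are integers in
    picoseconds so that 2.5 ns is represented exactly as 2500.
    """
    if len(values) % ratio or read_latency_ps % ratio:
        raise ValueError("values and latency must divide evenly by ratio")
    group = len(values) // ratio
    slot = read_latency_ps // ratio
    return [(i * slot, values[i * group:(i + 1) * group]) for i in range(ratio)]


operand_a = list(range(32))            # thirty-two placeholder data values
schedule = mux_schedule(operand_a, ratio=2, read_latency_ps=5000)
```

With `ratio=2` the two groups of sixteen values start at 0 ps and 2500 ps. The same function with `ratio=4` gives four groups of eight values at 1250 ps intervals.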

The MUX 331 can provide different data values to each of the MAC units 333. For example, given that there are sixteen MAC units 333 in the example of FIG. 3, the MUX 331 can provide sixteen data values (e.g., one hundred twenty-eight bits) to the MAC units 333 such that each of the MAC units 333 receives a different data value (e.g., eight bits) from the sixteen data values.

Each MAC unit 333 can include a multiplicator 334 and an accumulator 335 (also referred to as accumulation registers 335). Each of the MAC units 333 can receive a data value from the operand B and a data value from the operand A. The multiplicator 334 of each respective MAC unit 333 can perform a plurality of multiplication operations utilizing the received data values from the operand B and the operand A. The output of each respective multiplicator 334 can be provided to a different respective accumulator 335. Each accumulator 335 can sum the respective output of the multiplicators 334 and the previous outputs of the multiplicators 334. Each of the accumulators 335 can include thirty-two-bit registers.
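A minimal software model of a MAC unit, assuming plain Python integers in place of fixed-width registers, might look like the sketch below. The class name `MacUnit`, the method `step`, and the data values are hypothetical. The sketch does not model register width or overflow.

```python
class MacUnit:
    """One multiply-accumulate unit: a multiplier feeding an accumulator."""

    def __init__(self):
        self.accumulator = 0

    def step(self, a_value, b_value):
        product = a_value * b_value    # multiplication stage
        self.accumulator += product    # running sum of all products so far
        return self.accumulator


# Sixteen units, each given a different operand A value while the same
# operand B value is broadcast to all of them (hypothetical data).
units = [MacUnit() for _ in range(16)]
a_values = list(range(16))
for b_value in (2, 3):
    for unit, a_value in zip(units, a_values):
        unit.step(a_value, b_value)
totals = [unit.accumulator for unit in units]
```

Each unit ends up holding the sum of its own operand A value multiplied by every operand B value it was given.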

The example of FIG. 3 shows sixteen multiplicators 334 and sixteen accumulators 335. Reducing the quantity of the MAC units 333 can also reduce the quantity of multiplicators 334 and the quantity of accumulators 335 implemented in the PU 302. Reducing the quantity of multiplicators 334 and the quantity of accumulators 335 can reduce the size and/or cost of the PU 302.

The data in the accumulators 335 can be read to obtain the output of the MAC units 333. The interface between the accumulators 335 and the data bus 336 can also be updated to accommodate the fewer MAC units 333 implemented in view of the implementation of the MUX 331 in the PU 302. For example, each of the accumulators 335 can output two data values (e.g., sixteen bits) at a time. Given that there are sixteen accumulators 335, the output of the PU 302 can be thirty-two data values (e.g., two hundred fifty-six bits). Each of the accumulators 335 can be coupled to the data bus 336 via sixteen lines. In contrast, in previous approaches in which thirty-two accumulators were implemented, each of the accumulators 335 would be coupled to the data bus 336 using eight lines.

The implementation of the MUX 331 allows for fewer MAC units 333 to be implemented, which increases the quantity of lines coupling each of the MAC units 333 to the data bus 336. The increase in the quantity of lines coupling the MAC units 333 to the data bus 336 allows for the same throughput to be established between the data bus 336 and the MAC units 333 as compared to implementations where the MUX 331 is not implemented.

Once the MAC units 333 conclude performing a plurality of operations on the first sixteen data values (e.g., one hundred twenty-eight bits) provided by the MUX 331, the MUX 331 can provide the second sixteen data values (e.g., the second one hundred twenty-eight bits) to the MAC units 333, and operations (e.g., multiplication operations) can be performed on the second sixteen data values in an analogous manner. The MAC units 333 can consistently be utilized in the read latency (e.g., 5 ns) because sixteen data values (e.g., one hundred twenty-eight bits) are provided to the MAC units 333 every 2.5 ns.

FIG. 4 is a block diagram of a processing unit including a 4:1 MUX 431 in accordance with a number of embodiments of the present disclosure. The PU 402 can include the MUX 431 (e.g., 4:1 MUX), shift register 432, and MAC units 433. As described, the MUX 431, although shown as being part of the PU 402, can be implemented externally to the PU 402. The PU 402 can receive data from the banks (e.g., memory array). The PU 402 can also receive data from the data bus 436. The data bus 436 can include receivers and/or drivers. The data bus 436 can couple the PU 402 to the interface of the memory device. The data bus 436 can be used to provide data to the PU 402 from the host coupled to the memory device and/or from the banks of the memory device.

In the example of FIG. 4, the PU 402 can receive data from the banks and/or from the host via the data bus 436 once every read latency. For example, a bank can provide data to the PU 402 once every 5 ns.

An operand B can be provided to the PU 402 and stored in the registers 432. Operand B can be composed of thirty-two data values. The thirty-two data values can be stored in the registers 432. For example, the shift registers 432 can include thirty-two eight-bit registers. Each of the registers of the registers 432 can store a data value (e.g., eight bits). The operand B can be provided from a bank (e.g., DRAM array) of the memory device.

The registers 432 can be shift registers. The shift registers 432 can provide the same data value to each of the MAC units 433. Once a data value has been provided to the MAC units 433, the shift registers 432 can shift a position of the data values such that the next data value is available and the data value previously provided is last in line. The shift registers 432 can then provide the next eight bits to each of the MAC units 433. In such a fashion, the shift registers 432 can rotate through thirty-two data values, providing one data value at a time to the MAC units 433.

The operand A can be provided from the I/O lines via the data bus 436. In examples where multiple banks are coupled to the PU 402, the operand A can be provided to the PU 402 from a first bank and operand B can be provided to the PU 402 from a second bank. For example, each of the banks can provide data to the PU 402 once every read latency. However, the banks can be staggered in providing data to the PU 402 such that the PU 402 receives data every 2.5 ns.
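The effect of staggering can be checked with a small calculation. The sketch below assumes that the banks are offset evenly within one read latency, which matches the two-bank, 2.5 ns example but is otherwise an assumption. The function name `arrival_times_ps` is hypothetical and times are in picoseconds.

```python
def arrival_times_ps(num_banks, read_latency_ps, reads_per_bank):
    """Arrival times at the PU when banks are offset evenly within one latency."""
    offset = read_latency_ps // num_banks
    times = [
        bank * offset + read * read_latency_ps
        for bank in range(num_banks)
        for read in range(reads_per_bank)
    ]
    return sorted(times)


times = arrival_times_ps(num_banks=2, read_latency_ps=5000, reads_per_bank=3)
gaps = [later - earlier for earlier, later in zip(times, times[1:])]
```

Each bank still delivers data only once every 5000 ps, yet the gap between consecutive arrivals at the shared PU is 2500 ps.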

The operand A can also comprise thirty-two data values that can remain active in the data bus 436 for the duration of the read latency. The read latency can be 5 ns if the operand A is being received from the host or a bank. The read latency can be 2.5 ns if the operand A is being received from a bank and the operand B is received from a different bank.

Although the examples described herein are provided in terms of data being provided in two hundred fifty-six bit chunks (e.g., providing thirty-two data values), other size chunks can be provided to the PU 402. For example, the data bus 436 can carry sixteen data values (e.g., one hundred twenty-eight bits) or sixty-four data values (e.g., five hundred twelve bits).

The operand A can be provided to the MUX 431. The MUX 431 can be implemented internal to the PU 402 and between the MAC units 433 and the data bus 436. The MUX 431 can be implemented as an interface to the PU 402.

The MUX 431 can be a 4:1 MUX that provides a first portion of the data values during a first portion of the read latency, a second portion of the data values during a second portion of the read latency, a third portion of the data values during a third portion of the read latency, and a fourth portion of the data values during a fourth portion of the read latency. The data values can include the data values of the operand A. For example, the MUX 431 can provide the first eight data values (e.g., the first sixty-four bits) to the MAC units 433 during the first 1.25 ns of the read latency. The MUX 431 can provide the second eight data values (e.g., the second sixty-four bits) to the MAC units during the second 1.25 ns of the read latency. The MUX 431 can provide the third eight data values (e.g., the third sixty-four bits) to the MAC units during the third 1.25 ns of the read latency. The MUX 431 can provide the fourth eight data values (e.g., the fourth sixty-four bits) to the MAC units during the fourth 1.25 ns of the read latency. The MUX 431 can receive a clock signal that enables the MUX 431 to provide data values based on the partitioned read latency (e.g., every 1.25 ns).

The MUX 431 can provide a different data value to each of the MAC units 433. For example, given that there are eight MAC units 433 in the example of FIG. 4, the MUX 431 can provide eight data values (e.g., sixty-four bits) to the MAC units 433, at the same time, such that each of the MAC units 433 receives a different data value from the eight data values.

The MAC units 433 can include a multiplicator 434 and an accumulator 435, also referred to as accumulation registers 435. Each of the MAC units 433 can receive a data value from the operand B and a different data value from the operand A. The multiplicators 434 can perform a plurality of multiplication operations using the data values from the operand B and the operand A. The output of the multiplicators 434 can be provided to the accumulators 435. The accumulators 435 can sum the output of the multiplicators 434 and the previous outputs of the multiplicators 434. The accumulators 435 can each include thirty-two-bit registers. The example of FIG. 4 shows eight multiplicators 434 and eight accumulators 435. Reducing the quantity of the MAC units 433 can also reduce the quantity of multiplicators 434 and the quantity of accumulators 435 implemented in the PU 402. Reducing the quantity of multiplicators 434 and the quantity of accumulators 435 can reduce the expense of implementing the PU 402.

The accumulators 435 can be read to obtain the output of the MAC units 433. The interface between the accumulators 435 and the data bus 436 can also be updated to accommodate the fewer MAC units 433 implemented in view of the implementation of the MUX 431 (e.g., 4:1 MUX) in the PU 402. For example, each of the accumulators 435 can output thirty-two bits at a time. Given that there are eight accumulators 435, the output of the PU 402 can be two hundred fifty-six bits. Each of the accumulators 435 can be coupled to the data bus 436 via thirty-two lines. In previous approaches in which thirty-two accumulators were implemented, each of the accumulators 435 would be coupled to the data bus 436 using eight lines.

The implementation of the MUX 431 allows for fewer MAC units 433 to be implemented, which increases the quantity of lines coupling each of the MAC units 433 to the data bus 436 to retain the same throughput of two hundred fifty-six bits. The increase in the quantity of lines coupling the MAC units 433 to the data bus 436 allows for the same throughput to be established between the data bus 436 and the MAC units 433 as compared to implementations where the MUX 431 is not implemented.
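The trade-off between the MUX ratio, the number of MAC units, and the number of output lines per accumulator can be tabulated with simple arithmetic. The sketch below uses the figures quoted in the text (256 bits, a 5 ns read latency, thirty-two MAC units without a MUX). The names `configuration` and `table` are illustrative only.

```python
TOTAL_BITS = 256          # bits delivered per read latency in the text
BASELINE_MAC_UNITS = 32   # previous approach without a MUX
READ_LATENCY_PS = 5000    # 5 ns, expressed in picoseconds


def configuration(mux_ratio):
    """Derived quantities for a given MUX ratio (1 means no MUX)."""
    mac_units = BASELINE_MAC_UNITS // mux_ratio
    lines_per_accumulator = TOTAL_BITS // mac_units
    return {
        "mac_units": mac_units,
        "lines_per_accumulator": lines_per_accumulator,
        "slot_ps": READ_LATENCY_PS // mux_ratio,
        "total_lines": mac_units * lines_per_accumulator,
    }


table = {ratio: configuration(ratio) for ratio in (1, 2, 4)}
```

For ratios 1, 2, and 4 this gives 32, 16, and 8 MAC units with 8, 16, and 32 lines per accumulator, and the total stays at 256 lines in every case.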

Once the MAC units 433 conclude performing a plurality of operations on the eight data values provided by the MUX 431, the MUX 431 can provide the second eight data values to the MAC units 433, etc. The MAC units 433 can consistently be utilized in the read latency (e.g., 5 ns) because eight data values are provided to the MAC units 433 every 1.25 ns.

In various examples, the operand A can be provided from a first bank and a second bank coupled to the PU 402. Given that the read latency of the first bank is 5 ns and that the read latency of the second bank is also 5 ns, the first bank and the second bank can be configured to provide data at staggered intervals. For example, the first bank can provide data in the first 2.5 ns while the second bank provides data in the second 2.5 ns.

The MUX internal to the PU 402 can be configured as a 2:1 MUX. The MUX can receive the first operand A from the first bank and can provide a first half of the data during the first 1.25 ns of the read latency. The MUX can provide the second half of the first operand A in the second 1.25 ns. The MUX can receive a second operand A from a different bank and can provide a first half of the second operand A during the third 1.25 ns of the read latency. The MUX can provide the second half of the second operand A during the fourth 1.25 ns of the read latency. The sixteen MAC units can receive the operand A and the operand B from the MUX as similarly shown in FIG. 3.

FIG. 5 is a block diagram of a memory device 520 including a MUX 531 coupled to multiple ECC circuitries 503-1, 503-2 and a PU 502 in accordance with a number of embodiments of the present disclosure. FIG. 5 includes the banks 530-0, 530-1, 530-2, 530-3, 530-4, 530-5, 530-6, 530-7, 530-8, 530-9, 530-10, 530-11, 530-12, 530-13, 530-14, 530-15, which can be referred to collectively as banks 530. The banks 530 are analogous to the banks 230 of FIG. 2.

The banks 530 can be grouped into bank groups 521. For example, the banks 530-0, 530-1, 530-8, 530-9 can be grouped into a first bank group (e.g., bank group 0). The banks 530-2, 530-3, 530-10, 530-11 can be grouped into a second bank group (e.g., bank group 1). The banks 530-4, 530-5, 530-12, 530-13 can be grouped into a third bank group (e.g., bank group 2). The banks 530-6, 530-7, 530-14, 530-15 can be grouped into a fourth bank group (e.g., bank group 3).

The banks of each respective bank group can be organized into pairs that share a PU 502. For example, the banks 530-0, 530-8 of bank group 0 share a first PU (e.g., the PU 502). The banks 530-1, 530-9 of bank group 0 share a second PU. The banks 530-2, 530-10 of bank group 1 share a third PU. The banks 530-3, 530-11 of bank group 1 share a fourth PU. The banks 530-4, 530-12 of bank group 2 share a fifth PU. The banks 530-5, 530-13 of bank group 2 share a sixth PU. The banks 530-6, 530-14 of bank group 3 share a seventh PU. The banks 530-7, 530-15 of bank group 3 share an eighth PU.
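The grouping and pairing listed above follow a regular pattern that can be written as index arithmetic. The helper names below are hypothetical and the formulas are inferred from the listed bank numbers rather than stated in the source. Banks are numbered 0 to 15 and PUs are indexed from zero.

```python
def bank_group(bank):
    """Bank group index (0-3) for a bank numbered 0-15."""
    return (bank % 8) // 2


def shared_pu(bank):
    """Zero-based index of the PU shared by a bank and its partner."""
    return bank % 8


def partner(bank):
    """The other bank in the pair that shares the same PU."""
    return (bank + 8) % 16


groups = {}
for bank in range(16):
    groups.setdefault(bank_group(bank), []).append(bank)
```

For example, banks 0 and 8 map to PU index 0 (the first PU) in bank group 0, and banks 7 and 15 map to PU index 7 (the eighth PU) in bank group 3.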

The example of FIG. 5 also shows that the bank pairs do not share DSAs (e.g., DSAs 523-1, 523-2) or ECC circuitry (e.g., ECC circuitry 503-1, 503-2). In the example of FIG. 5, each bank has its own DSA and ECC circuitry but may share the PU. Having separate DSAs 523-1, 523-2 and ECC circuitries 503-1, 503-2 allows each of the banks 530 to provide data to the shared PU 502 independent of the other banks.

In the example of FIG. 5, a first bank can provide data (e.g., 256 bits) to the DSA 523-1 and a second bank can provide data (e.g., 256 bits) to the DSA 523-2 concurrently. The DSAs 523-1, 523-2 can amplify the sensed data and can provide the amplified data to the ECC circuitries 503-1, 503-2 in a first duration of time. For example, the DSA 523-1 can provide amplified data to the ECC circuitry 503-1 in the first duration of time. The DSA 523-2 can also provide amplified data to the ECC circuitry 503-2 in the first duration of time.

The ECC circuitries 503-1, 503-2 can perform a plurality of operations and can provide an output of the plurality of operations to the MUX 531 in a second duration of time. For example, the ECC circuitry 503-1 can perform a plurality of operations using data (e.g., 256 bits) received from the DSA 523-1 in the second duration of time to generate a first output. The ECC circuitry 503-2 can perform a plurality of operations using data (e.g., 256 bits) received from the DSA 523-2 in the second duration of time to generate a second output. The ECC circuitries 503-1, 503-2 can provide the first output and the second output in the second duration of time to the MUX 531. The ECC circuitries 503-1, 503-2 can provide the first output and the second output concurrently to the MUX 531.

There is no space saving in the implementation of separate DSAs 523 and ECC circuitries 503 for each bank. The use of the MUX 531 to couple a single PU 502 to the ECC circuitries 503-1, 503-2 allows for space savings as compared to implementing a separate PU for each bank.

The MUX 531 can provide the first output (e.g., 256 bits) to the PU 502 in a first half of a third duration of time and the second output (e.g., 256 bits) to the PU 502 in a second half of the third duration of time. The PU 502 can perform a plurality of operations on the received first output, which becomes a first input to the PU 502, in the first half of the third duration of time. The PU 502 can perform a plurality of operations on the received second output, which becomes a second input to the PU 502, in the second half of the third duration of time.
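The three durations described for this arrangement can be laid out as a simple transfer schedule. The sketch below is only a bookkeeping aid: the labels "DSA 1", "ECC 1", and so on are shorthand for the per-bank components, and the function names are hypothetical. It checks that the shared PU never receives two inputs in the same time slot.

```python
def two_ecc_schedule():
    """(duration, portion, source, sink) tuples for one bank pair."""
    return [
        (1, "whole", "DSA 1", "ECC 1"),
        (1, "whole", "DSA 2", "ECC 2"),
        (2, "whole", "ECC 1", "MUX"),
        (2, "whole", "ECC 2", "MUX"),
        (3, "first half", "MUX", "PU"),
        (3, "second half", "MUX", "PU"),
    ]


def pu_is_never_double_booked(schedule):
    """True when no two transfers reach the PU in the same slot."""
    slots = [(d, p) for d, p, _, sink in schedule if sink == "PU"]
    return len(slots) == len(set(slots))


schedule = two_ecc_schedule()
ok = pu_is_never_double_booked(schedule)
```

The DSA and ECC stages run concurrently for both banks, and only the final stage is time-multiplexed by the MUX.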

The PU 502 is shared between two banks and can function at twice the speed as compared to a PU that is implemented for a single bank. The PU 502 that is shared between two banks can also utilize half the die space as compared to the die space utilized by a first PU of a first bank and a second PU of a second bank.

FIG. 6 is a block diagram of a memory device 620 including a MUX 631 coupled to multiple DSAs 623-1, 623-2 and an ECC circuitry 603 in accordance with a number of embodiments of the present disclosure. FIG. 6 includes the banks 630-0, 630-1, 630-2, 630-3, 630-4, 630-5, 630-6, 630-7, 630-8, 630-9, 630-10, 630-11, 630-12, 630-13, 630-14, 630-15, which can be referred to collectively as banks 630. The banks 630 are analogous to the banks 230 of FIG. 2 and the banks 530 of FIG. 5.

The banks 630 can be grouped into bank groups 621. For example, the banks 630-0, 630-1, 630-8, 630-9 can be grouped into a first bank group (e.g., bank group 0). The banks 630-2, 630-3, 630-10, 630-11 can be grouped into a second bank group (e.g., bank group 1). The banks 630-4, 630-5, 630-12, 630-13 can be grouped into a third bank group (e.g., bank group 2). The banks 630-6, 630-7, 630-14, 630-15 can be grouped into a fourth bank group (e.g., bank group 3).

The banks of each respective bank group can be organized into pairs that share a PU 602. For example, the banks 630-0, 630-8 of bank group 0 share a first PU (e.g., the PU 602). The banks 630-1, 630-9 of bank group 0 share a second PU. The banks 630-2, 630-10 of bank group 1 share a third PU. The banks 630-3, 630-11 of bank group 1 share a fourth PU. The banks 630-4, 630-12 of bank group 2 share a fifth PU. The banks 630-5, 630-13 of bank group 2 share a sixth PU. The banks 630-6, 630-14 of bank group 3 share a seventh PU. The banks 630-7, 630-15 of bank group 3 share an eighth PU.

The example of FIG. 6 also shows that although the bank pairs do not share a DSA 623-1, 623-2, the bank pairs share the ECC circuitry 603. In the example of FIG. 6, each bank has its own DSA but shares an ECC circuitry 603 and the PU 602 with a different bank.

In the example of FIG. 6, a first bank can provide data (e.g., 256 bits) to the DSA 623-1 and a second bank can provide data (e.g., 256 bits) to the DSA 623-2. The DSAs 623-1, 623-2 can amplify the sensed data and can provide the amplified data to the MUX 631 in a first duration of time.

The MUX 631 can provide the amplified data to the ECC circuitry 603 in a first half of a second duration of time and a second half of the second duration of time. For instance, the MUX 631 can provide first amplified data, received from the DSA 623-1, in the first half of the second duration of time. The MUX 631 can provide second amplified data, received from the DSA 623-2, in the second half of the second duration of time.

The ECC circuitry 603 can perform a plurality of operations on the first amplified data in the first half of the second duration of time to generate a first output. The ECC circuitry 603 can perform a plurality of operations on the second amplified data in the second half of the second duration of time to generate a second output. The ECC circuitry 603 can occupy half the die space as compared to the die space occupied by a first ECC circuitry of a first bank and a second ECC circuitry of a second bank. The ECC circuitry 603 can also function at twice the speed as compared to the speed at which a first ECC circuitry of a first bank and a second ECC circuitry of a second bank function. The ECC circuitry 603 can provide the first output (e.g., 256 bits) to the PU 602 in the first half of the second duration of time and the second output (e.g., 256 bits) to the PU 602 in the second half of the second duration of time.

The PU 602 can perform a first plurality of operations on the first output, received as a first input, in a first half of a third duration of time. The PU 602 can perform the first plurality of operations on the second output, received as a second input, in the second half of the third duration of time. The PU 602 can utilize half of the die space as compared to the die space utilized by a first PU of a first bank and a second PU of a second bank. The PU 602 can also function at twice the speed as compared to the speed at which the first PU of the first bank and the second PU of the second bank function.
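The component savings of the two sharing arrangements can be compared by counting instances across a sixteen-bank device. The sketch below counts components only. Equating a halved count with half the die space assumes equally sized circuits, which is an assumption made here. The function name `component_counts` is hypothetical.

```python
BANKS = 16


def component_counts(share_ecc, share_pu):
    """DSA, ECC and PU counts when bank pairs optionally share ECC/PU."""
    pairs = BANKS // 2
    return {
        "dsa": BANKS,                          # one DSA per bank in both figures
        "ecc": pairs if share_ecc else BANKS,
        "pu": pairs if share_pu else BANKS,
    }


per_bank = component_counts(share_ecc=False, share_pu=False)
shared_pu_only = component_counts(share_ecc=False, share_pu=True)
shared_ecc_and_pu = component_counts(share_ecc=True, share_pu=True)
```

Sharing only the PU halves the PU count from sixteen to eight, and sharing the ECC circuitry as well halves the ECC count too. In both cases every bank keeps its own DSA.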

FIG. 7 illustrates an example flow diagram of a method 780 for implementing (e.g., operating) a multiplexor in a processing unit of memory in accordance with a number of embodiments of the present disclosure. The method can be performed by a memory device of a computing system, such as, for instance, memory device 120 of computing system 100 previously described in connection with FIG. 1.

At 781, a MUX (e.g., the MUX 631 of FIG. 6) of a memory device (e.g., the memory device 620 of FIG. 6) can provide first data to an ECC circuitry (e.g., the ECC circuitry 603 of FIG. 6) of the memory device in a first portion of a duration of time. The ECC circuitry can be shared between a first bank and a second bank that comprise a bank pair of a bank group of the memory device.

At 782, the MUX can provide second data to the ECC circuitry in a second portion of the duration of time. The ECC circuitry can function at twice the speed such that for each duration of time the ECC circuitry can process two sets of data. The ECC circuitry can utilize half the die space as compared to ECC circuitries that are not shared by multiple banks.

At 783, the ECC circuitry can perform a first plurality of operations using the first data in a first portion of the duration of time. At 784, the ECC circuitry can perform a second plurality of operations using the second data in the second portion of the duration of time. At 785, the ECC circuitry can provide a first output of the first plurality of operations to a PU (e.g., the PU 602 of FIG. 6) of the memory device in the first portion of the duration of time. At 786, the ECC circuitry can provide a second output of the second plurality of operations to the PU in the second portion of the duration of time.

The memory device can also include a first DSA coupled to the MUX. The first DSA can provide the first data to the MUX in a second duration of time. The second duration of time can occur before the first duration of time. A first bank can provide the first data to the first DSA in the second duration of time.

The memory device can also include a second DSA coupled to the MUX. The second DSA can provide the second data to the MUX in the second duration of time. A second bank of the memory device can provide the second data to the second DSA in the second duration of time.

In various examples, a first ECC circuitry can receive first data from a first bank of a memory device. A second ECC circuitry can receive second data from a second bank of the memory device. The first ECC circuitry can correspond to a first bank and the second ECC circuitry can correspond to a second bank of the memory device.

The memory device can also include a MUX. The MUX can receive the first data from the first ECC circuitry and the second data from the second ECC circuitry. The MUX can provide the first data in a first portion of a duration of time. The MUX can provide the second data in a second portion of the duration of time.

The memory device can also include a PU. The PU can perform a first plurality of multiplication operations utilizing the first data provided by the MUX during the first portion of the duration of time. The PU can also perform a second plurality of multiplication operations utilizing the second data provided by the MUX during the second portion of the duration of time.

The first bank and the second bank can be in a bank group of the memory device. A first DSA can provide the first data to the first ECC circuitry in a different duration of time. The first DSA can receive the first data from the first bank. A second DSA can provide the second data to the second ECC circuitry in the different duration of time. The first DSA and the second DSA can provide the first data and the second data concurrently. The second DSA can receive the second data from the second bank.

The PU can include a plurality of MAC units. The MAC units can perform the first plurality of multiplication operations utilizing the first data provided by the MUX during the first portion of the duration of time. The MAC units can also perform the second plurality of multiplication operations utilizing the second data provided by the MUX during the second portion of the duration of time.

After providing the first data and the second data, the MUX can receive additional data in a next duration of time. The next duration of time can occur after the duration of time.

In various instances, a first DSA can receive first data from a first bank of a memory device. A second DSA can receive second data from a second bank of the memory device. A MUX can receive the first data from the first DSA and the second data from the second DSA. The MUX can provide the first data in a first portion of a duration of time and the second data in a second portion of the duration of time. ECC circuitry can perform a first plurality of operations utilizing the first data provided by the MUX during the first portion of the duration of time and can perform a second plurality of operations utilizing the second data provided by the MUX during the second portion of the duration of time.

The ECC circuitry can, responsive to performing the first plurality of operations, generate a first output data in the first portion of the duration of time. The ECC circuitry can also provide the first output data to a PU in the first portion of the duration of time. The PU can perform a first plurality of multiplication operations in a first portion of a different duration of time.

The ECC circuitry can, responsive to performing the second plurality of operations, generate a second output data in the second portion of the duration of time. The ECC circuitry can provide the second output data to a PU in the second portion of the duration of time. The PU can perform a second plurality of multiplication operations in a second portion of a different duration of time.

FIG. 8 illustrates an example machine of a computer system 890 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, can be executed. In some embodiments, the computer system 890 can correspond to a host system (e.g., the host 110 of FIG. 1) that includes, is coupled to, or utilizes a memory device (e.g., the memory device 120 of FIG. 1) or can be used to perform the operations of the PU (e.g., the PU 102 of FIG. 1). In alternative embodiments, the machine can be connected (e.g., networked) to other machines in a LAN, an intranet, an extranet, and/or the Internet. The machine can operate in the capacity of a server or a client machine in a client-server network environment, as a peer machine in a peer-to-peer (or distributed) network environment, or as a server or a client machine in a cloud computing infrastructure or environment.

The machine can be a personal computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, a switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The example computer system 890 includes a processing device 891, a main memory 893 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc.), a static memory 897 (e.g., flash memory, static random access memory (SRAM), etc.), and a data storage system 898, which communicate with each other via a bus 896.

Processing device 891 represents one or more general-purpose processing devices such as a microprocessor, a central processing unit, or the like. More particularly, the processing device can be a complex instruction set computing (CISC) microprocessor, reduced instruction set computing (RISC) microprocessor, very long instruction word (VLIW) microprocessor, or a processor implementing other instruction sets, or processors implementing a combination of instruction sets. Processing device 891 can also be one or more special-purpose processing devices such as an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a digital signal processor (DSP), network processor, or the like. The processing device 891 is configured to execute instructions 892 for performing the operations and steps discussed herein. The computer system 890 can further include a network interface device 894 to communicate over the network 895.

The data storage system 898 can include a machine-readable storage medium 899 (also known as a computer-readable medium) on which is stored one or more sets of instructions 892 or software embodying any one or more of the methodologies or functions described herein. The instructions 892 can also reside, completely or at least partially, within the main memory 893 and/or within the processing device 891 during execution thereof by the computer system 890, the main memory 893 and the processing device 891 also constituting machine-readable storage media.

In one embodiment, the instructions 892 include instructions to implement functionality corresponding to the PU 102 of FIG. 1. While the machine-readable storage medium 899 is shown in an example embodiment to be a single medium, the term “machine-readable storage medium” should be taken to include a single medium or multiple media that store the one or more sets of instructions. The term “machine-readable storage medium” shall also be taken to include any medium that is capable of storing or encoding a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure. The term “machine-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.

Although specific embodiments have been illustrated and described herein, those of ordinary skill in the art will appreciate that an arrangement calculated to achieve the same results can be substituted for the specific embodiments shown. This disclosure is intended to cover adaptations or variations of various embodiments of the present disclosure. It is to be understood that the above description has been made in an illustrative fashion, and not a restrictive one. Combinations of the above embodiments, and other embodiments not specifically described herein will be apparent to those of skill in the art upon reviewing the above description. The scope of the various embodiments of the present disclosure includes other applications in which the above structures and methods are used. Therefore, the scope of various embodiments of the present disclosure should be determined with reference to the appended claims, along with the full range of equivalents to which such claims are entitled.

In the foregoing Detailed Description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the disclosed embodiments of the present disclosure have to use more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.


Patent Metadata

Filing Date

July 15, 2025

Publication Date

January 22, 2026

Inventors

Glen E. Hush
Peter L. Brown

