A memory circuitry with optimized error correction code architecture includes a memory array; a wordline driver, the wordline driver coupled to receive an address and select a corresponding wordline of the memory array; read circuitry coupled to bitlines of the memory array; and error correction code (ECC) logic coupled to receive outputs of a set of columns selected by the read circuitry. The ECC logic performs a partial decoding of the outputs to output a partial ECC result that represents if there is an error and where that error is located.
Legal claims defining the scope of protection, as filed with the USPTO.
. A memory circuitry comprising:
. The memory circuitry of,
. The memory circuitry of,
. The memory circuitry of, wherein the ECC logic comprises syndrome computation logic and the partial ECC result is a syndrome.
. The memory circuitry of, wherein the ECC logic comprises part of a hamming code logic.
. The memory circuitry of, wherein the ECC logic comprises partial decoding operations.
. The memory circuitry of, wherein the ECC logic comprises XOR gates.
. The memory circuitry of, further comprising hit circuitry structured to receive outputs of a second set of columns selected by the read circuitry and a set of tag bits of a received address for lookup for comparison.
. A method of operating a memory circuitry, the method comprising:
. The method of, wherein storing the data and ECC bits in the memory array comprises:
. The method of, wherein the set of columns selected by the read circuitry correspond to all columns of the memory array, wherein the partial ECC result represents if there is an error in any of the preamble bits of any of the plurality of ways in the row and where that error is located in the row.
. The method of, wherein storing the data and ECC bits in the memory array comprises:
. The method of, wherein the set of columns selected by the read circuitry correspond to columns storing the prologue bits, the memory data information of the way, and the ECC bits, wherein the partial ECC result represents if there is an error in data for the way and where that error is located in the row.
. The method of, wherein storing the data and ECC bits in the memory array comprises:
. The method of,
. The method of,
. The method of, wherein the ECC logic comprises syndrome computation logic and the partial ECC result is a syndrome.
. The method of, wherein the ECC logic comprises part of a hamming code logic.
. The method of, wherein the ECC logic comprises partial decoding operations.
. The method of, wherein the ECC logic comprises XOR gates.
Complete technical specification and implementation details from the patent document.
Cache memory and other memory subsystems can be located relatively close to a processor to provide fast access of frequently used data to the processor. Random Access Memory (RAM), and specifically Static Random Access Memory (SRAM), is typically the type of memory used for these memory subsystems. SRAM is generally configured as an array, or matrix, of memory units that are individually addressable.
Memory can be set-associative and organized by index and way. A cacheline refers to the data corresponding to a memory address. A set refers to a limited number of places in the memory where a cacheline can reside (e.g., if associativity is equal to 1, the memory is considered to be “direct mapped”). Each associativity corresponds to a “way”. For example, an associativity of 2 corresponds to two ways, an associativity of 4 corresponds to four ways, and an associativity of 16 corresponds to 16 ways. The index indicates which set a cacheline is stored or is to be stored into and is computed from the address. A tag refers to part of the address that is stored in the tag RAM and identifies, in conjunction with the index, the memory address that the cacheline corresponds with.
To find whether a memory address is in the cache memory or other memory subsystem, a lookup operation can be performed in the tag RAMs. As part of the lookup operation, a portion of an incoming address (e.g., the portion providing the tag function) is compared to the stored tags in the tag RAMs. A “hit” occurs when the incoming address (e.g., the portion providing the tag function) matches a stored tag in a way and the stored tag is considered valid (e.g., as per appropriate state bit(s)). In a typical n-way set-associative cache, data belonging to an address will be in 0 or 1 of n places. Based on the hit of the incoming tag portion with a tag in the tag RAM, the appropriate data RAM can be accessed. A typical way-halting cache attempts to reduce the number of tag bits that are accessed in each way. Thus, if there is any partial mismatch during the lookup (a “miss”), accesses to that way are halted, saving power by not completing the full tag lookup for that way.
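The lookup described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuitry of the disclosure: the field widths `OFFSET_BITS` and `INDEX_BITS`, the `TagEntry` class, and the helper names are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

OFFSET_BITS = 6  # hypothetical: 64-byte cachelines
INDEX_BITS = 6   # hypothetical: 64 sets


@dataclass
class TagEntry:
    tag: int
    valid: bool = False  # stands in for the "state bit(s)" mentioned in the text


def split_address(addr: int) -> Tuple[int, int]:
    """Return (tag, index); the offset bits within the cacheline are dropped."""
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index


def make_tag_ram(ways: int) -> List[List[TagEntry]]:
    """One list of `ways` entries per set, all initially invalid."""
    return [[TagEntry(0) for _ in range(ways)] for _ in range(1 << INDEX_BITS)]


def lookup(tag_ram: List[List[TagEntry]], addr: int) -> Optional[int]:
    """Return the way that hits, or None on a miss (0 or 1 of n places)."""
    tag, index = split_address(addr)
    for way, entry in enumerate(tag_ram[index]):
        if entry.valid and entry.tag == tag:
            return way
    return None
```

In this model every way of the selected set is compared in full, which is the behavior that way halting tries to avoid.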
Accessing memory, such as RAM, utilizes large amounts of energy when multiple ways are accessed all at once using an incoming address to find a matching address that may be in one way of the memory. A process that can locate the desired tag while accessing a minimal number of ways has the potential to save a substantial amount of energy.
Optimized error correction code (ECC) architecture for memory is provided. ECC bits are often stored with data in memory to assist with detecting (and possibly correcting) bit flips in the stored data. The described ECC architecture incorporates certain ECC logic within a memory, enabling power savings and, in some cases, faster operations.
A memory circuitry with optimized ECC architecture includes a memory array; a wordline driver, the wordline driver coupled to receive an address and select a corresponding wordline of the memory array; read circuitry coupled to bitlines of the memory array; and ECC logic coupled to receive outputs of a set of columns selected by the read circuitry. The ECC logic performs a partial decoding of the outputs to output a partial ECC result that represents if there is an error and where that error is located.
A method of operating a memory circuitry with optimized ECC architecture includes storing data and ECC bits encoding error information of the data in a memory array of the memory circuitry; receiving an address at the memory circuitry; selecting, by a wordline driver coupled to receive the address, a corresponding wordline of the memory array; and performing, by ECC logic of the memory circuitry, a partial decoding of output of a set of columns selected by read circuitry of the memory circuitry to output a partial ECC result that represents if there is an error and where that error is located. The set of columns selected by the read circuitry correspond to columns in which the data and ECC bits are stored in the memory array.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Optimized error correction code (ECC) architecture for memory is provided. ECC bits are often stored with data in memory to assist with detecting (and possibly correcting) bit flips in the stored data. The described ECC architecture incorporates certain ECC logic within a memory, enabling power savings and, in some cases, faster operations.
Traditional ECC can be extremely wasteful as the ECC bits are just added to the data fields stored in the memory (e.g., across a full way or set) and are read out during an access along with the data for transport to the logic performing the ECC calculations. By incorporating some ECC logic in memory circuitry, it is possible to avoid reading out all the data from the memory. As such, the switching power of the full calculation of ECC and the power associated with transport of the data from memory across a chip to the logic performing the full calculation of ECC outside the memory can be reduced. In addition, when the ECC logic described herein is incorporated in memory storing tag data of one or more ways, it is possible to implement tag way halting optimized for speed and power savings.
The corresponding figure illustrates a representational diagram of a memory circuitry with optimized ECC architecture. Referring to that figure, a memory circuitry with optimized ECC architecture includes a memory array; a wordline driver; read circuitry coupled to bitlines of the memory array; and error correction code (ECC) logic coupled to receive outputs of a set of columns selected by the read circuitry. The wordline driver, read circuitry, and ECC logic can operate under control of a control circuit.
The control circuit receives a clock signal and a control/enable signal, among other inputs, and generates outputs to control the other circuitry of the memory circuitry. The clock signal indicates the start of an operation for memory and provides the operating frequency for the circuitry. The control/enable signal can indicate, at the start of an operation, whether or not the memory will be accessed. Other inputs to the control circuit can include address information (not shown) so that column select signals can be generated by the control circuit.
The wordline driver is coupled to receive an address and select a corresponding wordline of the memory array.
The ECC logic performs a partial decoding of the outputs to output a partial ECC result that represents if there is an error and where that error is located. The ECC logic performs part of an ECC algorithm. That is, the ECC logic is not the logic for the full ECC algorithm; it forms only one part of that algorithm.
The ECC logic can include syndrome computation logic (with the output being a syndrome). In some cases, the ECC logic includes part of a Hamming code logic. In some cases, the ECC logic includes partial decoding operations. An example of the ECC logic is shown in the drawings. In some cases, the ECC logic includes XOR gates.
Accordingly, a method of operating a memory circuitry with optimized error correction code architecture can include storing data and ECC bits encoding error information of the data in a memory array of the memory circuitry; receiving an address at the memory circuitry; selecting, by a wordline driver coupled to receive the address, a corresponding wordline of the memory array; and performing, by ECC logic of the memory circuitry, a partial decoding of output of a set of columns selected by read circuitry of the memory circuitry to output a partial ECC result that represents if there is an error and where that error is located (e.g., in a row corresponding to the corresponding wordline selected by the wordline driver), wherein the set of columns selected by the read circuitry correspond to columns in which the data and ECC bits are stored in the memory array.
In some cases, the set of columns to which the ECC logic is coupled is a specific number of columns set aside for storing ECC bits in the memory array. In some cases, the memory array is structured to store a plurality of ECC bits, for example, up to 6 or 7 ECC bits in a row, the set of columns coupled to the ECC logic being the 6 or 7 columns corresponding to the location of the 6 or 7 ECC bits of each row.
The drawings also illustrate an example ECC logic providing a partial ECC result, and example circuitry implementing an error correcting code algorithm in which that ECC logic can be used.
Referring to the example ECC logic, a simplified case is shown for covering errors that may arise in four data bits (each denoted d). Here, three ECC bits (each denoted p) are used to encode error information. The data bits and ECC bits are input to XOR gates to output a partial ECC result. In the illustrated example, three-input XOR gates are used for the seven bits (four data bits and three ECC bits) to output three bits (each denoted S). Here, the partial ECC result can be the output of a first stage of a Hamming code-based ECC algorithm, and the three S bits can be referred to as the syndrome.
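A minimal sketch of such a syndrome computation is given below, assuming a textbook Hamming(7,4) code. The exact gate wiring of the figure is not recoverable from the text, so the bit names `d1` to `d4`, `p1` to `p3`, and `s1` to `s3`, and the choice of which bits feed each XOR, are assumptions made for illustration. In this standard layout each syndrome bit is the XOR of one ECC bit and three data bits.

```python
from typing import Sequence, Tuple


def syndrome(d: Sequence[int], p: Sequence[int]) -> Tuple[int, int, int]:
    """Compute the three syndrome bits from four data bits and three ECC bits.

    Assumed layout (textbook Hamming(7,4)): p1 covers d1, d2, d4;
    p2 covers d1, d3, d4; p3 covers d2, d3, d4.
    """
    d1, d2, d3, d4 = d
    p1, p2, p3 = p
    s1 = p1 ^ d1 ^ d2 ^ d4
    s2 = p2 ^ d1 ^ d3 ^ d4
    s3 = p3 ^ d2 ^ d3 ^ d4
    return (s1, s2, s3)


def syndrome_value(s: Tuple[int, int, int]) -> int:
    """Pack (s1, s2, s3) into an integer, s1 being the least significant bit."""
    s1, s2, s3 = s
    return s1 | (s2 << 1) | (s3 << 2)
```

An all-zero syndrome means no single-bit error was detected. Under the assumed layout, a non-zero value gives the 1-based position of the flipped bit in the order p1, p2, d1, p3, d2, d3, d4.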
As illustrated in more detail in the drawings, an ECC algorithm using a Hamming code includes encoding logic and decoding logic. The decoding logic can include syndrome computation (e.g., as implemented by the ECC logic), syndrome matching, and data correction. When data is stored in memory, the error information for the data is encoded in the form of ECC bits (the bits p in the example) using the encoding logic and stored along with the data in the memory. In some cases, ECC bits are provided for a particular data (e.g., for the data of a tag used in tag way halting) or across an entire row of data (e.g., which may contain information of multiple tags/ways).
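The encoding step can be illustrated with a short sketch, again assuming a textbook Hamming(7,4) code rather than the specific encoding logic of the figure. The function names `encode_ecc` and `store_word` and the subscripted bit names are hypothetical.

```python
from typing import Sequence, Tuple


def encode_ecc(d: Sequence[int]) -> Tuple[int, int, int]:
    """Derive three ECC bits from four data bits (assumed Hamming(7,4) parity sets)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return (p1, p2, p3)


def store_word(d: Sequence[int]) -> Tuple[int, ...]:
    """Model a write: the ECC bits are stored alongside the data bits."""
    return tuple(d) + encode_ecc(d)
```

For example, `store_word((1, 0, 1, 1))` yields the seven stored bits `(1, 0, 1, 1, 0, 1, 0)`, with the last three being the ECC bits.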
Typically, the data and the ECC bits are read out from the memory and transferred to logic functions of the decoding logic in a subsequent step (which may or may not be performed in parallel with next operations of a system and/or offline). However, as described herein, a portion of the decoding logic can be included in the memory itself (e.g., ECC logic in memory circuitry) such that a partial ECC result from the ECC logic is read out of the memory. Although an example Hamming code ECC algorithm is shown, other error correcting code algorithms may be used (whether Hamming code-based or not). Example ECC algorithms that may be used include, but are not limited to, Hamming codes such as two-bit detect, one-bit correct codes. Advantageously, by including in the memory the ECC logic for part of an ECC algorithm, it is not necessary to read out both the data and the ECC bits in order to check for errors in the data. This capability enables power savings and supports the use of a two-phase tag way halting architecture.
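The split between the in-memory partial decode and the remaining decode steps can be sketched as follows. This is a simplified model under stated assumptions: a textbook Hamming(7,4) code with bits ordered p1, p2, d1, p3, d2, d3, d4, and a policy (assumed here, not stated in the text) that the full word is fetched for correction only when the syndrome is non-zero. All names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed codeword order, positions 1..7.
POSITION_NAMES = ("p1", "p2", "d1", "p3", "d2", "d3", "d4")


def in_memory_syndrome(word: Sequence[int]) -> int:
    """Partial decode modeled inside the memory: XOR of the positions of set bits.

    Only this small value needs to leave the memory to check for errors.
    """
    s = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            s ^= position
    return s


def match_and_correct(word: Sequence[int], s: int) -> Tuple[List[int], Optional[str]]:
    """Syndrome matching and correction modeled outside the memory.

    Returns the corrected word and the name of the flipped bit, or None
    when the syndrome is zero.
    """
    fixed = list(word)
    if s == 0:
        return fixed, None
    fixed[s - 1] ^= 1
    return fixed, POSITION_NAMES[s - 1]
```

With this layout, a zero syndrome indicates that no single-bit error was detected, so the correction step and the transfer of the full word can be skipped.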
As explained above, as part of a lookup operation to determine whether an address can be found in a cache near a processing unit, a portion of an incoming address (e.g., the portion providing the tag function) is compared to the stored data forming the tag in each way. A “hit” occurs when the incoming address data (e.g., the portion providing the tag function) matches the stored data (e.g., the “tag”) in a way and the stored data is considered valid (e.g., as per appropriate state bit(s)). In a typical n-way set-associative cache, data belonging to an address will be in 0 or 1 of n places. Based on the hit of the incoming address and data of the tag RAMs, the data RAM cache memory location at the address of the matching stored data can be accessed. For a typical way-halting cache there is an attempt to reduce the number of bits of the tags that are accessed in each way. Thus, if there is any partial mismatch during the lookup (a “miss”), accesses to that way are halted, saving power.
Current way halting techniques and configurations can suffer from high energy consumption and area overhead due to duplication of efforts across many ways (e.g., as part of additional circuitry and parallel operations) and can suffer delay penalties due to routing hit signals across a chip to different banks and memories. In addition, the power consumption due to parallel accesses of multiple memories can be an issue. Current way halting techniques are also frequency limiting because they look up the entire tag in the same access cycle. This creates a long cycle time and makes them unusable in modern designs.
In a two-phase tag way halting architecture as presented herein, a first part of the tag lookup is used to filter accesses to ways containing bits of the tag for the second part of the tag lookup by inhibiting access to memory of the ways that mismatch. The first part of the tag lookup uses a first set of bits of the tag and can be referred to as “preamble bits” or “preamble”. The second part of the tag lookup uses a second set of bits of the tag and can be referred to as “prologue bits” or “prologue”.
The corresponding figure shows a simplified representation of a proposed two-phase access utilizing a memory architecture as described herein.
Referring to that figure, an n-way cache of a proposed memory architecture can include one or more preamble tag memories (e.g., preamble tag RAM) and one or more prologue tag memories/RAMs for each preamble tag RAM (where n is an integer greater than or equal to 1). A two-phase access is enabled by using the preamble tag RAM to control access to the prologue tag memories for the n ways.
First, a hit or miss of a first set of bits (e.g., preamble-A) of a tag portion of an address with respect to each way of a plurality of ways is determined at the preamble tag RAM using the preamble-A and an index portion of the address. Then, for each hit of the first set of bits, a corresponding way with stored prologue bits of the tags and remaining memory data information of the addresses is accessed, and a hit or miss of the prologue-B of the tag portion with respect to that corresponding way is determined using the prologue-B and the index portion of the address for lookup (e.g., with the appropriate prologue tag memory accessed as enabled by selection logic coupled to the prologue tag memories, which enables access to each of the prologue tag memories under control of hit or miss signal(s) output from the preamble tag RAM).
In that manner, only the ways that correspond to the partial hit from the preamble tag RAM are accessed in the prologue tag memory, and the prologue-B of the address is used to determine a complete, combined hit or miss for the address. Example implementations of the preamble tag RAM and of a prologue tag memory are shown in the drawings.
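The two-phase access can be modeled behaviorally as below. This is a sketch only: the 4-bit preamble and 9-bit prologue widths follow the example given elsewhere in the description, but taking the preamble from the low bits of the tag, omitting valid/state bits, and the data-structure shapes are assumptions for illustration. Hardware would evaluate the ways in parallel, whereas this model loops over them.

```python
from typing import List, Optional, Tuple

PREAMBLE_BITS = 4  # example width from the description
PROLOGUE_BITS = 9  # example width from the description


def split_tag(tag: int) -> Tuple[int, int]:
    """Split a tag into (preamble, prologue); using the low bits as preamble is an assumption."""
    preamble = tag & ((1 << PREAMBLE_BITS) - 1)
    prologue = (tag >> PREAMBLE_BITS) & ((1 << PROLOGUE_BITS) - 1)
    return preamble, prologue


def two_phase_lookup(
    preamble_ram: List[List[int]],   # preamble_ram[index][way]
    prologue_rams: List[List[int]],  # prologue_rams[way][index]
    index: int,
    tag: int,
) -> Tuple[Optional[int], List[int]]:
    """Return (hit way or None, list of prologue memories actually accessed)."""
    preamble, prologue = split_tag(tag)

    # Phase 1: one preamble row is read and compared for every way.
    candidates = [
        way for way, stored in enumerate(preamble_ram[index]) if stored == preamble
    ]

    # Phase 2: only prologue memories of ways with a partial hit are accessed.
    hit_way = None
    accessed = []
    for way in candidates:
        accessed.append(way)
        if prologue_rams[way][index] == prologue:
            hit_way = way
    return hit_way, accessed
```

The second return value makes the power argument visible: ways whose preamble mismatches never appear in the accessed list.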
It should be understood that while n prologue tag RAMs are shown for n ways for illustrative purposes, more than one way may be combined in a same RAM. For example, two or more ways may be combined into one RAM. In addition, in some cases, more than one preamble tag RAM is provided in order to be able to store the preambles of all the ways.
The corresponding figure illustrates a representational diagram of a memory circuitry that can be used in a first phase of tag way-halting as described herein. Referring to that figure, the memory circuitry includes a memory array, a control circuit, wordline driver, input/output circuitry, and hit circuitry. Similar to the memory circuitry described earlier, this memory circuitry includes optimized ECC architecture through the inclusion of ECC logic. The ECC logic performs a partial decoding of the outputs to output a partial ECC result that represents if there is an error and where that error is located. The ECC logic can include syndrome computation logic (with the output being a syndrome) such as described above.
The memory array is structured in an array of bitcells with rows accessed by wordlines and columns accessed by bitlines. Each bitcell refers to the memory element storing a single bit of information. In certain implementations, the memory array is static random-access memory (SRAM). The control circuit provides control signals for operations of the memory circuitry. The wordline driver receives an address and turns on a wordline indicated by the address in response to receiving a signal from the control circuit. The input/output circuitry contains the read circuitry and write circuitry that utilize bitlines to read and write data out of and into the memory array. The hit circuitry supports the determination of a hit/miss of the tag bits. The ECC logic supports certain parts of error correction processes within the memory circuitry. For example, the ECC logic can be coupled to receive outputs of a set of columns selected by read circuitry of the input/output circuitry, wherein the ECC logic performs a partial decoding of the outputs to output a partial ECC result that represents if there is an error and where that error is located in a row.
Accordingly, in the architecture of the n-way cache described above, the memory array functions as the preamble tag RAM by storing a set of tag bits of each of a plurality of the ways (e.g., the preamble portion). In addition to the preamble portions of a plurality of ways stored in each row of the memory array, a set of ECC bits are stored in each row. The ECC bits can encode any errors found in the entire row of data. An example of data that may be stored in a memory array implementing the preamble tag RAM is illustrated in the drawings. It can be seen from that example that a memory storing 4 bits for each of 16 ways is sufficiently covered by 6 ECC bits and that the ECC logic can be structured to generate a syndrome using the 70 bits of the data in a row (64 preamble bits plus 6 ECC bits).
In some cases, the set of tag bits of all the n ways are able to be stored in the memory array. In cases where the set of tag bits of all of the n ways are not able to be stored in the memory array (e.g., due to there being more bits than available space), additional memory circuitry (e.g., additional preamble tag RAM) can be provided for the preamble portions.
The first set of bits (e.g., the preamble) from the tag portion of an address is used by the hit circuitry for determining a hit or miss of the first set of bits with respect to each way of the plurality of the ways covered by the memory circuitry. For example, the hit circuitry can be coupled to receive outputs of a second set of columns selected by the read circuitry of the input/output circuitry and a set of tag bits of a received address for lookup for comparison.
Address bits (“index portion”) from the set portion are used to select the appropriate wordline by the wordline driver. The ECC logic uses the ECC bits stored in the memory array to carry out part of the ECC operations (e.g., at least a portion of a detection operation). ECC bits are used to determine the integrity of the data (e.g., whether a value has flipped, such as due to radiation) and can be used to perform error correction.
Accordingly, a method of operating a memory circuitry with optimized error correction code architecture can include storing data and ECC bits encoding error information of the data in a memory array of the memory circuitry; receiving an address (e.g., index bits from the address) at the memory circuitry; selecting, by a wordline driver coupled to receive the address, a corresponding wordline of the memory array; and performing, by ECC logic of the memory circuitry, a partial decoding of output of a set of columns selected by read circuitry of the input/output circuitry of the memory circuitry to output a partial ECC result that represents if there is an error and where that error is located in a row corresponding to the corresponding wordline selected by the wordline driver, wherein the set of columns selected by the read circuitry of the input/output circuitry correspond to columns in which the data and ECC bits are stored in the memory array.
The storing of the data and ECC bits in the memory array can include loading (e.g., using write circuitry of the input/output circuitry) preamble bits of a plurality of ways in the row of the memory array and loading the ECC bits in the row, wherein the ECC bits encode error information across all bits of the preamble bits of the plurality of ways in the row. Then, when reading from the memory circuitry, the set of columns selected by the read circuitry correspond to the entire row, wherein the partial ECC result represents if there is an error in any of the preamble bits of any of the plurality of ways in the row and where that error is located in the row.
Advantageously, by incorporating the hit circuitry and ECC logic in memory circuitry, determining a hit or miss of the first set of bits with respect to each way of a plurality of ways and performing a partial error correction code operation can be performed in a same stage as a read operation of the memory circuitry. Furthermore, since the ECC logic is included in the memory circuitry, the preamble bits and the ECC bits of an entire row do not need to be read out of the memory and transported across the chip to perform the error correction code algorithm logic. Rather, just the partial ECC result is output, which enables power savings and enables the benefits of performing hit operations within the memory circuitry to be fully realized.
The corresponding figure illustrates a representational diagram of a memory circuitry that can be used in a second phase of tag way-halting as described herein. Referring to that figure, the memory circuitry includes a memory array, a control circuit, wordline driver, input/output circuitry, hit circuitry, and ECC logic. The memory array, control circuit, wordline driver, and input/output circuitry can be implemented as described for the corresponding elements of the first-phase memory circuitry. In addition, similar to the first-phase memory circuitry, the hit circuitry and ECC logic support the determination of a hit/miss of the tag bits for a way and certain parts of error correction processes within the memory circuitry. For example, the ECC logic can be coupled to receive outputs of a set of columns selected by the read circuitry of the input/output circuitry, wherein the ECC logic performs a partial decoding of the outputs to output a partial ECC result that represents if there is an error and where that error is located in a row. However, unlike in the first-phase memory circuitry, the hit circuitry and ECC logic can be structured to support operations with respect to the prologue bits and memory data information that are to be stored in the memory circuitry.
As mentioned above, for each partial hit of the preamble performed in the first phase, a corresponding way is accessed, and determination of a hit or miss is performed using the prologue bits. Here, in the architecture of the n-way cache described above, the memory array stores the prologue portion of a tag and other bits of the address/memory data information in the RAM corresponding to that way. An example of data that may be stored in a memory array of a memory storing prologue bits (e.g., prologue tag memory) is illustrated in the drawings. Accordingly, the second set of bits (e.g., prologue-B) from the tag portion can be used by the hit circuitry to determine a hit or miss of the prologue bits. For example, the hit circuitry can be coupled to receive outputs of a second set of columns selected by the read circuitry of the input/output circuitry and a set of tag bits (e.g., the prologue-B) of a received address for lookup for comparison. In this way, the prologue bits are only accessed in the second phase when there is a partial hit on the preamble bits. Similar to the first phase, address bits (“index portion”) from the set portion are used to select the appropriate wordline by the wordline driver.
The ECC logic uses the ECC bits stored in the memory array to carry out part of the ECC operations (e.g., at least a portion of a detection operation). It can be seen from the example data that a memory storing 9 bits of prologue and 22 state bits for a way is sufficiently covered by 6 ECC bits and that the ECC logic can be structured to generate a syndrome using the 37 bits of the data for one way (9 + 22 + 6). Furthermore, it can be possible to use the 6 ECC bits to cover more than one way if more than one way is stored in a row. For example, 6 ECC bits could cover two ways where each way includes the 9 bits of prologue and 22 state bits and the ECC logic is structured to generate a syndrome using the 68 bits (2 × 31 + 6).
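The row widths quoted in the description can be checked with a small calculation. The helper `row_bits` and the constant names are hypothetical. The sketch only reproduces the arithmetic of the examples and does not model the code construction itself.

```python
def row_bits(ways: int, bits_per_way: int, ecc_bits: int) -> int:
    """Number of bits fed to the in-memory syndrome logic for one row."""
    return ways * bits_per_way + ecc_bits


# Preamble tag RAM example: 16 ways x 4 preamble bits, plus 6 ECC bits.
PREAMBLE_ROW_BITS = row_bits(ways=16, bits_per_way=4, ecc_bits=6)

# Prologue tag memory examples: 9 prologue bits + 22 state bits per way.
PROLOGUE_ONE_WAY_BITS = row_bits(ways=1, bits_per_way=9 + 22, ecc_bits=6)
PROLOGUE_TWO_WAY_BITS = row_bits(ways=2, bits_per_way=9 + 22, ecc_bits=6)
```

These evaluate to 70, 37, and 68 bits respectively, matching the figures given in the text. In each case only the 6-bit partial ECC result needs to leave the memory to check for errors.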
Accordingly, a method of operating a memory circuitry with optimized error correction code architecture can include storing data and ECC bits encoding error information of the data in a memory array of the memory circuitry; receiving an address (e.g., index bits from the address) at the memory circuitry; selecting, by a wordline driver coupled to receive the address, a corresponding wordline of the memory array; and performing, by ECC logic of the memory circuitry, a partial decoding of output of a set of columns selected by read circuitry of the input/output circuitry of the memory circuitry to output a partial ECC result that represents if there is an error and where that error is located in a row corresponding to the corresponding wordline selected by the wordline driver, wherein the set of columns selected by the read circuitry of the input/output circuitry correspond to columns in which the data and ECC bits are stored in the memory array.
In some cases, storing the data and ECC bits in the memory array can include loading (e.g., using write circuitry of the input/output circuitry) prologue bits and memory data information of a way of a plurality of ways in the row of the memory array and loading the ECC bits in the row, wherein the ECC bits encode error information across all bits of the way in the row. In some cases, the set of columns selected by the read circuitry correspond to columns storing the prologue bits, the memory data information of the way, and the ECC bits for the data of the way, wherein the partial ECC result represents if there is an error in data for the way and where that error is located in the row.
In some cases, storing the data and ECC bits in the memory array can include loading (e.g., using write circuitry of the input/output circuitry) prologue bits and memory data information of two ways of a plurality of ways in the row of the memory array and loading the ECC bits in the row, wherein the ECC bits encode error information across all bits of the two ways in the row.
Advantageously, by incorporating the hit circuitry and ECC logic in memory circuitry, determining a hit or miss of the remaining bits from the tag portion of the address at a particular way and performing a partial error correction code operation can be performed in a same cycle, subsequent to the first phase, and this subsequent cycle can be part of a read operation of the memory circuitry.
Accordingly, by incorporating additional logic within the RAM used for a Way Halting Cache, it is possible to minimize the timing delays caused by the slow speed of current memories as compared to the increased operational speed of logic circuitry when having to first read out all of the bits in the RAM before performing logic operations to complete a lookup operation in the Way Halting Cache. Furthermore, by reducing the number of RAMs being accessed, additional power savings can be achieved.
The corresponding figure illustrates an example of data that may be stored in a memory array of a way halting cache as described herein. Referring to that figure, data within the memory array can include the preamble bits from a plurality of ways (and may include the preamble bits from all available ways). In the example, preamble bits of a 16-way cache are shown. Here, four bits of the tag (b0, b1, b2, b3) are stored as the preamble for each way (Way0, Way1, . . . , Way15) in a row of the memory array. In addition, ECC bits are stored, covering the preambles of all sixteen ways. In such a case, 6 ECC bits may be used as an example.
Accordingly, the hit circuitry can compare all the preamble bits in the row to the preamble bits from the address. For example, for a given row, the preamble bits of Way0, the preamble bits of Way1, and so on through the preamble bits of Way15 are each compared to the preamble bits of the incoming address (e.g., of the tag of the address). In addition, the ECC logic can be used to perform a first partial error correction code operation utilizing the ECC bits for that row.
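The per-row comparison can be sketched as a bit-level model. The packing order (Way0 in the least significant four bits) and the idea of returning a 16-bit hit vector with one bit per way are assumptions for illustration. The text does not specify the physical column order or the format of the hit signals.

```python
from typing import Sequence

WAYS = 16
PREAMBLE_BITS = 4
PREAMBLE_MASK = (1 << PREAMBLE_BITS) - 1


def pack_row(preambles: Sequence[int]) -> int:
    """Pack one 4-bit preamble per way into a single row value (Way0 in the low bits)."""
    row = 0
    for way, value in enumerate(preambles):
        row |= (value & PREAMBLE_MASK) << (way * PREAMBLE_BITS)
    return row


def hit_vector(row: int, incoming_preamble: int) -> int:
    """Compare every stored preamble to the incoming one; bit w is set if way w matches."""
    hits = 0
    for way in range(WAYS):
        stored = (row >> (way * PREAMBLE_BITS)) & PREAMBLE_MASK
        if stored == (incoming_preamble & PREAMBLE_MASK):
            hits |= 1 << way
    return hits
```

In this model the 64 stored preamble bits stay inside the memory and only the per-way match bits are produced, which is the kind of output that could gate the second-phase accesses.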
The corresponding figure illustrates another example of data that may be stored in a memory array of a way halting cache as described herein. Referring to that figure, data within the memory array can include the prologue bits, memory data information, and ECC bits for each row (whether one or more ways are in the RAM) or per way in a row. In the example, 9 prologue bits (based on 4 preamble bits of a 13-bit tag being in a preamble tag RAM), 22 bits of the remaining address information, and corresponding ECC bits are stored in each entry. Six ECC bits may be used as an example.
Accordingly, the hit circuitry can compare the prologue bits of an entry (e.g., a row) to the prologue bits from the address. In addition, the ECC logic can be used to perform a first partial error correction code operation utilizing the ECC bits for that entry (e.g., covering the prologue bits and remaining address information).
November 27, 2025