Device and Method for Enhancing Item Access Bandwidth and Atomic Operation

Published: January 28, 2020

Assignee: not available in USPTO data

Inventors: Chuang Bao, Zhenlin Yan, Chunhui Zhang, Kang An

Technical Abstract

Patent Claims

19 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A device for improving an item access bandwidth and atomic operation, comprising: a memory storing processor-executable instructions; and a processor arranged to execute the stored processor-executable instructions to perform steps of: after a lookup request is received from a service side, determining whether an address pointed to by the lookup request is identical to an item address stored in a cache; if they are identical, and a valid identifier vld is currently valid, directly returning item data stored in the cache to the service side without initiating a request for looking up an off-chip memory, so as to reduce accessing the off-chip memory; and if they are not identical, initiating a request for looking up the off-chip memory, and processing, according to a preset rule, item data returned by the off-chip memory in such a way that an atomic operation existed in item updating can realize a seamless and faultless lookup in an item lookup process, wherein the preset rule is used for determining whether the address pointed to by the lookup request is identical to the item address stored in the cache, comprising any one of the following ways: way 1: if a vld corresponding to a low first-threshold-M bit address is completely valid, and a high second-threshold-N bit address is identical to the item address stored in the cache, returning data in the cache to the service side, and not updating the data in the cache, where if the addresses are not identical, not updating the data in the cache, and sending the data returned by the off-chip memory to the service side; way 2: if the vld corresponding to the low first-threshold-M bit address is partially valid, not updating the data in the cache, and sending the data returned by the off-chip memory to the service side; and way 3: if the vld corresponding to the low first-threshold-M bit address is invalid, updating the data in the cache, and sending the data returned by the off-chip memory to the service side, and wherein both M and N are natural numbers, and a sum of M and N is a bit width requested by the service side.
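
As an informal illustration of the address split and the three-way preset rule recited in claim 1, the following Python sketch may help. It is not part of the claims. The function names `split_address` and `apply_preset_rule`, the list-of-booleans model of the vld bits, and the `(data, update_cache)` return value are all assumptions made for illustration only.

```python
from typing import Any, Sequence, Tuple


def split_address(addr: int, m: int) -> Tuple[int, int]:
    """Split a request address into (low M bits, remaining high N bits)."""
    low = addr & ((1 << m) - 1)  # low M bits: used as the cache address
    high = addr >> m             # high N bits: compared with the stored item address
    return low, high


def apply_preset_rule(vld_bits: Sequence[bool], cached_high: int, cached_data: Any,
                      req_high: int, offchip_data: Any) -> Tuple[Any, bool]:
    """Return (data sent to the service side, whether the cache is updated).

    vld_bits is assumed to be a non-empty sequence holding the valid flags
    of the cache entries that make up one item (hypothetical model).
    """
    if all(vld_bits):
        # Way 1: completely valid. Cache data wins on an address match;
        # otherwise the off-chip data is returned. No cache update either way.
        if cached_high == req_high:
            return cached_data, False
        return offchip_data, False
    if any(vld_bits):
        # Way 2: partially valid (an update is in progress). No cache update.
        return offchip_data, False
    # Way 3: invalid. Update the cache and return the off-chip data.
    return offchip_data, True
```

Under this model, an item whose vld bits are only partly set is never served from or written into the cache, which is one way to read the atomicity guarantee the claim describes.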

2. The device according to claim 1, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: configuring a service item, and for the case of single-burst item update, giving an instruction of writing a single-burst item; and after mediation by a first mediation module, writing a high second-threshold-N bit address of the service item or the item data into the cache by taking the low first-threshold-M bit address as an address, setting a vld register corresponding to the address to 1 through a control module, and giving an instruction of updating the off-chip memory to complete the item update.

3. The device according to claim 2, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: for the case of single-burst item, determining whether the vld corresponding to the low first-threshold-M bit address of the lookup request is valid, and if it is valid, initiating a lookup of the cache by using a low first-threshold-M bit address of the lookup request, and obtaining a lookup result; and parsing the lookup result, comparing a found address with the high second-threshold-N bit address of the lookup request, and if they are identical, directly returning the result from cache lookup to the service side through a distribution module, not initiating the request for looking up the off-chip memory, and reading and discarding data in a lookup information storage module.

4. The device according to claim 3, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: when the lookup request matches none of the addresses in the cache, initiating the request for looking up the off-chip memory, and after the item data is returned, taking the item address and multiple-burst information out from the lookup information storage module; for the case of single-burst item, determining through the control module whether a vld corresponding to a low first-threshold-M bit address of an address is valid, and if it is valid, reading the cache after mediation of a second mediation module, comparing high second-threshold-N bits of an acquired address with high second-threshold-N bits of the taken-out item address; and if they match, replacing data of a corresponding address with the item data returned from the off-chip memory and writing data back into the cache, and returning data to the service side through the distribution module.

5. The device according to claim 1, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: configuring a service item, and for the case of multiple-burst item update, giving an instruction of writing a multiple-burst item; after mediation by a first mediation module, writing a high second-threshold-N bit address of the multiple-burst item or the item data into the cache by taking a value obtained by left shifting a low first-threshold-M bit address for 2^S bits as a first particular address, setting a vld corresponding to the first particular address to 0 through a control module, and not giving an instruction of updating the off-chip memory; for a second burst, writing the high second-threshold-N bit address of the multiple-burst item or the item data into the cache by taking a value obtained by left shifting the low first-threshold-M bit address for 2^S bits plus 1 as a second particular address, setting a vld corresponding to the second particular address to 0 through the control module, not giving the instruction of updating the off-chip memory, at the same time, setting a vld of a first burst to 1, and giving an instruction of updating a vld item; and by analogy, when an address of a penultimate burst returned by the off-chip memory matches an address, obtained by left shifting a low first-threshold-M bit address for 2^S bit, +S−2, setting a vld corresponding to a last burst to 1, and giving an instruction of updating the off-chip memory to complete the item update.
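
The vld sequencing during a multiple-burst update can be sketched as follows. This is only one possible reading of claim 5, under stated assumptions: the base address is taken as the low M-bit address shifted left by S bits (the shift amount used in claim 6, whereas claim 5 itself is worded as "2^S bits"); each burst clears its own vld and validates the previous burst; and the last burst is validated at the point where the off-chip update would be instructed. The name `multi_burst_update`, the dict-based cache, the list of vld flags, and the returned snapshots are hypothetical.

```python
from typing import Dict, List, Tuple


def multi_burst_update(cache: Dict[int, Tuple[int, bytes]], vld: List[bool],
                       low_addr: int, high_addr: int,
                       bursts: List[bytes], s: int) -> List[List[bool]]:
    """Write 2**s bursts and return the item's vld flags after each step."""
    count = 1 << s
    if len(bursts) != count:
        raise ValueError("expected 2**s bursts")
    base = low_addr << s  # assumed base address of the item's cache entries
    snapshots = []
    for i, data in enumerate(bursts):
        cache[base + i] = (high_addr, data)  # store high N bits with the burst data
        vld[base + i] = False                # the burst being written is not yet valid
        if i > 0:
            vld[base + i - 1] = True         # the previous burst becomes valid
        snapshots.append(vld[base:base + count])
    # Final step: validate the last burst; the off-chip update instruction
    # would be given here (not modelled in this sketch).
    vld[base + count - 1] = True
    snapshots.append(vld[base:base + count])
    return snapshots
```

Every snapshot except the last contains at least one cleared flag, so a lookup that requires all 2^S flags to be valid would not be served from the cache while the update is in flight.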

6. The device according to claim 5, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: for the case of multiple-burst item, and when there are 2^S multiple-burst items, determining through the control module whether vlds corresponding to 2^S contiguous addresses after a low first-threshold-M bit address of the lookup request is left shifted for S bits are valid, and if all of them are valid, continuously initiating 2^S requests for looking up the cache after the low first-threshold-M bit address of the lookup request is left shifted for S bits, and obtaining a lookup result; and parsing the lookup result, comparing a found address with a high second-threshold-N bit address of the lookup request, and if they are identical, directly returning spliced results from cache lookup to the service side through a distribution module, not initiating the request for looking up the off-chip memory, and reading and discarding data in a lookup information storage module, wherein S is a natural number.
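
The multiple-burst cache lookup of claim 6 can be illustrated with a short Python sketch. It assumes a dict-based cache keyed by entry address that stores `(high address bits, burst data)` pairs, a list of vld flags, and byte strings for burst data; the name `multi_burst_lookup` and the `None` return value standing for "fall back to the off-chip memory" are hypothetical, and the lookup information storage module is not modelled.

```python
from typing import Dict, List, Optional, Tuple


def multi_burst_lookup(cache: Dict[int, Tuple[int, bytes]], vld: List[bool],
                       addr: int, m: int, s: int) -> Optional[bytes]:
    """Return the spliced item data on a cache hit, or None on a miss."""
    low = addr & ((1 << m) - 1)   # low M bits of the lookup request
    high = addr >> m              # high N bits of the lookup request
    base = low << s               # first of 2**s contiguous entry addresses
    count = 1 << s
    if not all(vld[base:base + count]):
        return None               # some burst is not valid: look up off-chip
    entries = [cache[base + i] for i in range(count)]
    if any(stored_high != high for stored_high, _ in entries):
        return None               # stored address differs: look up off-chip
    return b"".join(data for _, data in entries)  # splice the bursts in order
```

For example, with `m=2` and `s=1`, a request whose low bits are 1 reads entries 2 and 3 and returns their data concatenated, provided both vld flags are set and both stored high addresses match.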

7. The device according to claim 6, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: when the lookup request matches none of the addresses in the cache, initiating the request for looking up the off-chip memory, and after the item data is returned, taking the item address and multiple-burst information out from the lookup information storage module; for the case of multiple-burst item, firstly determining through the control module whether the vlds corresponding to 2^S contiguous addresses after a low first-threshold-M bit address of an address is left shifted for S bits are valid, and if all of them are valid, reading the cache after mediation of a second mediation module, and comparing high second-threshold-N bits of an acquired address with high second-threshold-N bits of the taken-out item address; and if they match, replacing data of a corresponding address with the item data returned from the off-chip memory and writing data back into the cache, and returning data to the service side through the distribution module.

8. The device according to claim 7, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: when the lookup request is received, according to a multiple-burst identifier carried in the lookup request, determining through the control module whether vld corresponding to 2^S contiguous requests after a low first-threshold-M bit address of a service request is left shifted for S bits is valid, and if it is valid, reading data of a corresponding cache, and determining whether a high second-threshold-N bit address of the service request matches the address in the cache; and if they match, directly returning data to the service side, and if they do not match, initiating the request for looking up the off-chip memory.

9. The device according to claim 8, wherein the processor is arranged to execute the stored processor-executable instructions to further perform steps of: after the item data is returned, reading the lookup information storage module to acquire a lookup request address and a multiple-burst identifier; determining through the control module whether the vld corresponding to 2^S contiguous requests after the low first-threshold-M bit address of the service request is left shifted for S bits is valid, and if it is valid, reading the data of the corresponding cache, and determining whether the high second-threshold-N bit address of the service request matches a service address returned to the cache; if they match, returning the item data in the cache to the service side through the distribution module, not updating the item data in the cache, and if they do not match, directly returning the item data in the off-chip memory to the service side through the distribution module, and updating the item data in the cache; and if a vld corresponding to multiple bursts is partially valid, which indicates that the item update is not completed, returning the item data in the off-chip memory to the service side through the distribution module, and not updating the item data in the cache.

10. A method for improving an item access bandwidth and atomic operation, the method comprising: after a lookup request is received from a service side, determining whether an address pointed to by the lookup request is identical to an item address stored in a cache; if they are identical, and a valid identifier vld is currently valid, directly returning item data stored in the cache to the service side without initiating a request for looking up an off-chip memory, so as to reduce accessing the off-chip memory; and if they are not identical, initiating a request for looking up the off-chip memory, and processing, according to a preset rule, item data returned by the off-chip memory in such a way that an atomic operation existed in item updating can realize a seamless and faultless lookup in an item lookup process, wherein the preset rule is used for determining whether the address pointed to by the lookup request is identical to the item address stored in the cache, comprising any one of the following ways: way 1: if a vld corresponding to a low first-threshold-M bit address is completely valid, and a high second-threshold-N bit address is identical to the item address stored in the cache, returning data in the cache to the service side, and not updating the data in the cache; if the addresses are not identical, not updating the data in the cache, and sending the data returned by the off-chip memory to the service side; way 2: if the vld corresponding to the low first-threshold-M bit address is partially valid, not updating the data in the cache, and sending the data returned by the off-chip memory to the service side; and way 3: if the vld corresponding to the low first-threshold-M bit address is invalid, updating the data in the cache, and sending the data returned by the off-chip memory to the service side, and wherein both M and N are natural numbers, and a sum of M and N is a bit width requested by the service side.

11. The method according to claim 10, further comprising: configuring, by a central processing unit, a service item, and for the case of single-burst item update, giving an instruction of writing a single-burst item; and after mediation by a first mediation module, writing a high second-threshold-N bit address of the service item or the item data into the cache by taking the low first-threshold-M bit address as an address, setting a vld register corresponding to the address to 1 through a control module, and giving an instruction of updating the off-chip memory to complete the item update.

12. The method according to claim 11, further comprising: for the case of single-burst item, determining, by a comparison module, whether the vld corresponding to the low first-threshold-M bit address of the lookup request is valid, if it is valid, initiating a lookup of the cache by using a low first-threshold-M bit address of the lookup request, and obtaining a lookup result; parsing the lookup result, and comparing a found address with the high second-threshold-N bit address of the lookup request; and if they are identical, directly returning the result from cache lookup to the service side through a distribution module, not initiating the request for looking up the off-chip memory, and reading and discarding data in a lookup information storage module.

13. The method according to claim 12, further comprising: when the lookup request matches none of the addresses in the cache, initiating, by the comparison module, the request for looking up the off-chip memory, and after the item data is returned, taking the item address and multiple-burst information out from the lookup information storage module; for the case of single-burst item, determining through the control module whether a vld corresponding to a low first-threshold-M bit address of an address is valid; if it is valid, reading the cache after mediation of a second mediation module, and comparing high second-threshold-N bits of an acquired address with high second-threshold-N bits of the taken-out item address; and if they match, replacing data of a corresponding address with the item data returned from the off-chip memory and writing data back into the cache, and returning data to the service side through the distribution module.

14. The method according to claim 10, further comprising: configuring, by a central processing unit, a service item, and for the case of multiple-burst item update, giving an instruction of writing a multiple-burst item; after mediation by a first mediation module, writing, by a first burst, a high second-threshold-N bit address of the multiple-burst item or the item data into the cache by taking a value obtained by left shifting a low first-threshold-M bit address for 2^S bits as a first particular address, setting the vld corresponding to the first particular address to 0 through a control module, and not giving the instruction of updating the off-chip memory; for a second burst, writing the high second-threshold-N bit address of the multi-burst item or the item data in the cache by taking a value obtained by left shifting the low first-threshold-M bit address for 2^S bits plus 1 as a second particular address, setting a vld corresponding to the second particular address to 0 through the control module, and not giving the instruction of updating the off-chip memory; at the same time, setting a vld of the first burst to 1, and giving an instruction of updating a vld item; and by analogy, when an address of a penultimate burst returned by the off-chip memory matches an address, obtained by left shifting a low first-threshold-M bit address for 2^S bit, +S−2, setting a vld corresponding to a last burst to 1, and giving the instruction of updating the off-chip memory to complete the item update.

15. The method according to claim 14, further comprising: for the case of multiple-burst item, and when there are 2^S multiple-burst items, determining through the control module, by a comparison module, whether vlds corresponding to 2^S contiguous addresses after a low first-threshold-M bit address of the lookup request is left shifted for S bits are valid; if all of them are valid, continuously initiating 2^S requests for looking up the cache after left shifting the low first-threshold-M bit address of the lookup request for S bits, and obtaining a lookup result; parsing the lookup result, and comparing a found address with a high second-threshold-N bit address of the lookup request; and if they are identical, directly returning spliced results from cache lookup to the service side, not initiating the request for looking up the off-chip memory, and reading and discarding the data in a lookup information storage module, wherein S is a natural number.

16. The method according to claim 15, further comprising: when the lookup request matches none of the addresses in the cache, initiating, by the comparison module, the request for looking up the off-chip memory, and after the item data is returned, taking the item address and multiple-burst information out from the lookup information storage module; for the case of multiple-burst item, first determining through the control module whether the vlds corresponding to 2^S contiguous addresses after a low first-threshold-M bit address of an address is left shifted for S bits are valid; if all of them are valid, reading the cache after mediation of a second mediation module, and comparing high second-threshold-N bits of an acquired address with high second-threshold-N bits of the taken-out address; and if they match, replacing data of a corresponding address with the item data returned from the off-chip memory and writing data back into the cache, and returning data to the service side through a distribution module.

17. The method according to claim 16, further comprising: when the lookup request is received, determining through the control module, by the comparison module, whether vld corresponding to 2^S contiguous requests after a low first-threshold-M bit address of a service request is left shifted for S bits is valid according to a multiple-burst identifier carried in the lookup request; if it is valid, reading data of a corresponding cache, and determining whether a high second-threshold-N bit address of the service request matches an address in the cache; and if they match, directly returning data to the service side; and if they do not match, initiating the request for looking up the off-chip memory.

18. The method according to claim 17, further comprising: after the item data is returned, reading, by the comparison module, the lookup information storage module to acquire a lookup request address and a multiple-burst identifier; determining through the control module whether vld corresponding to 2^S contiguous requests after the low first-threshold-M bit address of the service request is left shifted for S bits is valid; if all of them are valid, reading the data of the corresponding cache; determining whether the high second-threshold-N bit address of the service request matches a service address returned to the cache; if they match, returning the item data in the cache to the service side through the distribution module, and not updating the item data in the cache; if they do not match, directly returning the item data in the off-chip memory to the service side through the distribution module, and updating the item data in the cache; and if a vld corresponding to multiple-bursts is partially valid, which indicates that the item update is not completed, returning the item data in the off-chip memory to the service side through the distribution module, and not updating the item data in the cache.

19. A non-transitory computer storage medium having stored therein computer executable instructions arranged to perform a method for improving an item access bandwidth and atomic operation, the method comprising: after a lookup request is received from a service side, determining whether an address pointed to by the lookup request is identical to an item address stored in a cache; if they are identical, and a valid identifier vld is currently valid, directly returning item data stored in the cache to the service side without initiating a request for looking up an off-chip memory, so as to reduce accessing the off-chip memory; if they are not identical, initiating a request for looking up the off-chip memory, and processing, according to a preset rule, item data returned by the off-chip memory in such a way that an atomic operation existed in item updating can realize a seamless and faultless lookup in an item lookup process, wherein the preset rule is used for determining whether the address pointed to by the lookup request is identical to the item address stored in the cache, comprising any one of the following ways: way 1: if a vld corresponding to a low first-threshold-M bit address is completely valid, and a high second-threshold-N bit address is identical to the item address stored in the cache, returning data in the cache to the service side, and not updating the data in the cache; if the addresses are not identical, not updating the data in the cache, and sending the data returned by the off-chip memory to the service side; way 2: if the vld corresponding to the low first-threshold-M bit address is partially valid, not updating the data in the cache, and sending the data returned by the off-chip memory to the service side; and way 3: if the vld corresponding to the low first-threshold-M bit address is invalid, updating the data in the cache, and sending the data returned by the off-chip memory to the service side, and wherein both M and N are natural numbers, and a sum of M and N is a bit width requested by the service side.

Patent Metadata

Filing Date

Unknown

Publication Date

January 28, 2020

Inventors

Chuang Bao

Zhenlin Yan

Chunhui Zhang

Kang An
