US-10853078

Method and apparatus for supporting speculative memory optimizations

Published: December 1, 2020
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LICM protection structure tracks information to compare an address of a store or snoop microoperation with its entries, and re-loads the load microoperation of a matching entry.

Patent Claims
20 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A processor comprising: a store buffer to store store instructions to be processed to store data in main memory; a load buffer to store load instructions to be processed to load data from main memory; a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer, the LICM protection structure to track information to compare an address of a store or snoop microoperation with entries in the LICM protection structure; and an address generation unit (AGU) coupled to the LICM protection structure, the AGU to check whether a store or snoop operation is in a read modify write (RMW) sequence, the AGU to bypass the LICM protection structure for a matching entry in a check of the LICM protection structure for a detected RMW sequence.

2. The processor of claim 1, wherein the LICM protection structure includes fields for each entry of any one or more of a valid bit, LSET bit, address, post-retirement bit, load buffer identifier, and CEIP.
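
The fields listed in claim 2 can be pictured as one record per entry. The following is a minimal Python sketch, not the patented design: the class name `LicmEntry`, the field types, and the default values are assumptions, and the claim text does not define what LSET or CEIP mean, so they are kept as opaque values here.

```python
from dataclasses import dataclass


@dataclass
class LicmEntry:
    """One entry of a hypothetical LICM protection structure (fields from claim 2)."""
    valid: bool = False            # valid bit
    lset: bool = False             # LSET bit; meaning not defined in the claim text
    address: int = 0               # address compared against stores and snoops
    post_retirement: bool = False  # post-retirement bit
    load_buffer_id: int = 0        # load buffer identifier
    ceip: int = 0                  # CEIP; treated as an opaque value (assumption)


# Example: an entry tracking address 0x1000 for load buffer slot 3.
entry = LicmEntry(valid=True, address=0x1000, load_buffer_id=3)
```

Because the claim covers "any one or more" of these fields, a real structure could hold only a subset of them.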

3. The processor of claim 1, wherein the AGU is further to check, where an RMW sequence is not detected, whether the store or snoop operation has a destination address that is in one of a plurality of entries of the LICM protection structure, and to re-load a load operation of one of the plurality of entries that matches the destination address.
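
The check described in claims 1 and 3 can be sketched as a small lookup: a detected RMW sequence bypasses the structure, and otherwise a store or snoop whose destination address matches a valid entry triggers a re-load of that entry's load. This is an illustrative Python model under stated assumptions: the names `Entry` and `check_store_or_snoop`, the exact-address comparison, and returning a load buffer identifier to signal the re-load are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    valid: bool
    address: int
    load_buffer_id: int


def check_store_or_snoop(entries: List[Entry], address: int,
                         in_rmw: bool) -> Optional[int]:
    """Return the load buffer id of the load to re-load, or None."""
    if in_rmw:
        # Detected RMW sequence: bypass the protection structure entirely.
        return None
    for e in entries:
        # Assumption: a match is an exact address compare on a valid entry.
        if e.valid and e.address == address:
            return e.load_buffer_id
    return None


table = [Entry(True, 0x1000, 3), Entry(False, 0x2000, 4)]
```

With this table, a non-RMW store to 0x1000 returns load buffer id 3, while the same store inside an RMW sequence returns `None`.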

4. The processor of claim 1, further comprising: a front end communicatively coupled to the LICM protection structure, the front end to optimize a code sequence with LICM and allocate an entry for an optimization in the LICM protection structure.

5. The processor of claim 1, further comprising: a load clear structure coupled to the LICM protection structure, the load clear structure to track retirement of load clear microoperations that identify a completion of an LICM sequence.
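
One way to picture the tracking in claim 5 is a table of outstanding load clear microoperations that records a sequence as complete once its load clear retires. The Python sketch below is an assumption-laden illustration: `LoadClearTracker`, its methods, and the use of integer ids are hypothetical, and the claim does not say what the hardware does after completion is recorded.

```python
class LoadClearTracker:
    """Hypothetical tracker for load clear micro-ops (illustrative only)."""

    def __init__(self):
        self.pending = {}       # load clear uop id -> LICM sequence id
        self.completed = set()  # sequence ids whose load clear has retired

    def issue(self, uop_id, sequence_id):
        # Record a load clear micro-op that has not yet retired.
        self.pending[uop_id] = sequence_id

    def retire(self, uop_id):
        # Retirement of the load clear marks its LICM sequence as complete.
        sequence_id = self.pending.pop(uop_id, None)
        if sequence_id is not None:
            self.completed.add(sequence_id)
        return sequence_id


tracker = LoadClearTracker()
tracker.issue(uop_id=7, sequence_id=1)
```

Calling `tracker.retire(7)` returns sequence id 1 and adds it to `tracker.completed`; retiring an unknown id returns `None`.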

6. The processor of claim 1, wherein the store buffer and the load buffer are to perform two-way checks with the load buffer to compare retiring loads to younger stores and the store buffer to compare retiring stores with older loads to ensure memory disambiguation coherency.
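
The two-way checks in claim 6 can be sketched as two symmetric filters over the buffers. This Python model is a sketch under assumptions: the `MemOp` type, the use of a program-order `age` number (smaller means older), and the exact-address match are all hypothetical, and the claim does not state what action follows a detected conflict.

```python
from typing import List, NamedTuple


class MemOp(NamedTuple):
    age: int      # program-order sequence number; smaller means older (assumed)
    address: int


def younger_stores_matching(retiring_load: MemOp,
                            store_buffer: List[MemOp]) -> List[MemOp]:
    """Stores younger than the retiring load that target the same address."""
    return [s for s in store_buffer
            if s.age > retiring_load.age and s.address == retiring_load.address]


def older_loads_matching(retiring_store: MemOp,
                         load_buffer: List[MemOp]) -> List[MemOp]:
    """Loads older than the retiring store that read the same address."""
    return [ld for ld in load_buffer
            if ld.age < retiring_store.age and ld.address == retiring_store.address]


loads = [MemOp(1, 0x40), MemOp(5, 0x80)]
stores = [MemOp(2, 0x40), MemOp(3, 0x80)]
```

Here the load at age 1 conflicts with the younger store at age 2 to the same address, while the store at age 3 finds no older load to 0x80 because that load has age 5.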

7. The processor of claim 1, further comprising: a binary translation front end to identify load instructions that can bypass a store buffer check for load forwarding.

8. A computer system comprising: a main memory; and a system on a chip coupled to the main memory, the system on a chip including a processor including a pipeline with a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer, the LICM protection structure to track information to compare an address of a store or snoop microoperation with entries in the LICM, an address generation unit (AGU) coupled to the LICM protection structure, the AGU to check whether a store or snoop operation is in a read modify write (RMW) sequence, the AGU to bypass the LICM protection structure for a matching entry in a check of the LICM protection structure for a detected RMW sequence.

9. The computer system of claim 8, wherein the LICM protection structure includes fields for each entry of any one or more of a valid bit, LSET bit, address, post-retirement bit, load buffer identifier, and CEIP.

10. The computer system of claim 8, wherein the AGU is further to check, where an RMW sequence is not detected, whether the store or snoop operation has a destination address that is in one of a plurality of entries of the LICM protection structure, and to re-load a load operation of one of the plurality of entries that matches the destination address.

11. The computer system of claim 8, further comprising: a front end communicatively coupled to the LICM protection structure, the front end to optimize a code sequence with LICM and allocate an entry for an optimization in the LICM protection structure.

12. The computer system of claim 8, further comprising: a load clear structure coupled to the LICM protection structure, the load clear structure to track retirement of load clear microoperations that identify a completion of an LICM sequence.

13. The computer system of claim 8, wherein the store buffer and the load buffer are configured for two-way checks with the load buffer comparing retiring loads to younger stores and the store buffer comparing retiring stores with older loads to ensure memory disambiguation coherency.

14. The computer system of claim 8, further comprising: a binary translation front end to identify load instructions that can bypass a store buffer check for load forwarding.

15. A non-transitory computer readable medium having stored therein a set of instructions, which when executed cause a computer to perform a set of operations comprising: storing store instructions in a store buffer to be processed to store data in main memory; storing load instructions in a load buffer to be processed to load data from main memory; identifying a loop invariant code motion (LICM) in a code sequence; updating a LICM protection structure to track information about the LICM in the code sequence; comparing an address of a store or snoop microoperation with entries in the LICM; and checking whether the store or snoop operation is in a read modify write (RMW) sequence to bypass the LICM protection structure for detected RMW sequences.

16. The non-transitory computer readable medium of claim 15, further comprising: checking, where a read modify write (RMW) sequence is not detected, whether the store or snoop operation has a destination address that is in one of a plurality of entries of the loop invariant code motion (LICM) protection structure, and to re-load a load operation of one of the plurality of entries that matches the destination address.

17. The non-transitory computer readable medium of claim 15, further comprising: optimizing a code sequence with LICM; and allocating an entry for the optimization in the LICM protection structure.

18. The non-transitory computer readable medium of claim 15, further comprising: tracking retirement of load clear microoperations that identify a completion of an LICM sequence.

19. The non-transitory computer readable medium of claim 15, further comprising: performing two-way checks by comparing retiring loads to younger stores and comparing retiring stores with older loads to ensure memory disambiguation coherency.

20. The non-transitory computer readable medium of claim 15, further comprising: identifying load instructions that can bypass a store buffer check for load forwarding.

Patent Metadata

Filing Date

December 21, 2018

Publication Date

December 1, 2020

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Method and apparatus for supporting speculative memory optimizations” (US-10853078). https://patentable.app/patents/US-10853078