US-7340567

Value prediction for missing read operations instances

Published: March 4, 2008
Assignee: not available in the USPTO data we have
Inventors: not available in the USPTO data we have
Technical Abstract

Typically, missing read operations instances account for a small fraction of the operations instances of an application, but for nearly all of the performance degradation due to access latency. Hence, a small predictor structure maintains sufficient information for performing value prediction for the small fraction of operations (the missing instances of read operations) that account for nearly all of the access latency performance degradation. With such a small predictor structure, a processor value predicts for selective instances of read operations, those selective instances being read operations that are unavailable in a first memory (e.g., those instances of read operations that miss in L2 cache). Respective actual values for prior missing instances of the read operations are stored and used for value predictions of respective subsequent instances of the read operations. The value predictions are, at least partially, based on accuracy of value predictions for prior corresponding missing instances of the read operations.

Patent Claims
97 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1

1. A processor that value predicts for selective instances of read operations, the processor value predicting for those instances of read operations with values that are unavailable in a low-latency memory and requested from a high-latency memory while ignoring those instances of read operations unavailable in the low-latency memory requested from a second low-latency memory, and that at least partially bases the selective value predictions on accuracy of value predictions for prior corresponding missing instances of the read operations.
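The selection rule in claim 1 can be pictured as a simple filter over how a read was served. The sketch below is illustrative only: the `ReadOutcome` names and the `should_value_predict` helper are hypothetical, and the accuracy-based part of the claim is not shown here.

```python
from enum import Enum, auto


class ReadOutcome(Enum):
    """How a read instance was served (hypothetical labels, not from the patent)."""
    HIT_LOW_LATENCY = auto()               # value available in the low-latency memory
    SERVED_BY_SECOND_LOW_LATENCY = auto()  # missed, but requested from a second low-latency memory
    SERVED_BY_HIGH_LATENCY = auto()        # missed and requested from the high-latency memory


def should_value_predict(outcome: ReadOutcome) -> bool:
    """Predict only for misses that go to the high-latency memory."""
    return outcome is ReadOutcome.SERVED_BY_HIGH_LATENCY


for outcome in ReadOutcome:
    print(outcome.name, should_value_predict(outcome))
```

Only the third outcome qualifies; hits and misses served by the second low-latency memory are ignored by the predictor.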

2

2. The processor of claim 1, wherein the low-latency memory includes one or more of L1 cache and L2 cache.

3

3. The processor of claim 1, wherein the high-latency memory includes one or more of L3 cache, random access memory, flash memory, erasable programmable memory, and read-only memory.

4

4. The processor of claim 1, wherein the processor includes a memory operations unit that comprises a missing read operation value prediction structure used by the processor to value predict for the missing instances of read operations.

5

5. The processor of claim 4, wherein the memory operations unit includes a load store queue or a memory disambiguation buffer.

6

6. The processor of claim 4, wherein the missing read operation value prediction structure includes entries to host indications of read operations, predicted values, and value prediction qualifiers.

7

7. The processor of claim 6, wherein the value prediction qualifiers reflect the accuracy of prior value predictions.

8

8. The processor of claim 6, wherein the value prediction qualifiers include one or more of confidence values and strength values.

9

9. The processor of claim 1, wherein the processor causes invocation of a value prediction state machine to value predict for missing instances of read operations, said invocation being coincident with detection of a read operation instance missing in the low-latency memory.

10

10. The processor of claim 9, wherein the value prediction state machine accesses a missing read operations value prediction encoding to value predict, wherein the low-latency memory hosts at least a part of the missing read operations value prediction encoding.

11

11. The processor of claim 10, wherein said value prediction state machine accesses the missing read operations value prediction encoding with an address constructed from a base register and from read operation identifying information.
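One way to read the address construction in claim 11 is base-plus-offset addressing into a table held in memory. The sketch below assumes the identifying information is the read operation's program counter, and the entry size, table size, and `entry_address` helper are hypothetical choices made only for illustration.

```python
ENTRY_SIZE_BYTES = 16   # assumed size of one prediction entry
NUM_ENTRIES = 1024      # assumed number of entries in the encoding


def entry_address(base_register: int, pc: int) -> int:
    """Build the address of a prediction entry from a base register and a PC."""
    # Drop the two low PC bits (assumes 4-byte aligned instructions), then wrap
    # the result into the table so every PC maps to some entry.
    index = (pc >> 2) % NUM_ENTRIES
    return base_register + index * ENTRY_SIZE_BYTES


print(hex(entry_address(0x10000, 0x4008)))  # 0x10020
```

With these assumed sizes, the PC `0x4008` selects entry 2, which sits 32 bytes past the base address.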

12

12. The processor of claim 9, wherein the processor causes traps to be issued coincident with arrival of actual values for missing instances of read operations, and a trap handler updates the missing read operations value prediction encoding in accordance with the traps, wherein the issued traps indicate the actual values.

13

13. The processor of claim 9, wherein the value prediction state machine updates the missing read operations value prediction encoding coincident with arrival of actual values for missing instances of read operations indicated in the missing read operations value prediction encoding.

14

14. The processor of claim 1, wherein the processor causes traps to be issued coincident with detection of read operation instances missing in the low-latency memory, and a trap handler accesses a missing read operations value prediction encoding to value predict in accordance with the issued traps, wherein the traps at least indicate the missing instances of the read operations.

15

15. The processor of claim 14, further comprising the trap handler to construct an address to access the missing read operations value prediction encoding.

16

16. The processor of claim 15, wherein the address is constructed from a base address and read operation identifying information.

17

17. The processor of claim 14, wherein the low-latency memory hosts at least a part of the missing read operations value prediction encoding.

18

18. The processor of claim 14, wherein the processor causes traps to be issued coincident with arrival of actual values for value predicted missing instances of read operations, and the trap handler updates the missing read operations value prediction encoding accordingly, wherein the traps coincident with actual value arrivals at least indicate the actual values.

19

19. The processor of claim 1 further comprising the processor preserving execution state.

20

20. The processor of claim 19, wherein said preserving execution state comprises checkpointing register state prior to speculative execution of missing instances of read operations with predicted values, and returning to the checkpointed register state for those value predicted missing instances of read operations determined to be mis-predictions.
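The checkpoint-and-return behavior in claim 20 can be illustrated with a toy register file. This is a sketch under stated assumptions: the `run_with_prediction` function and register names are hypothetical, and re-executing the dependent operations with the actual value after rollback is an assumption added to make the example complete; the claim itself only recites returning to the checkpointed state.

```python
from typing import Callable, Dict, List

Registers = Dict[str, int]


def run_with_prediction(
    regs: Registers,
    dest: str,
    predicted: int,
    actual: int,
    dependent_ops: List[Callable[[Registers], None]],
) -> Registers:
    """Speculate on a predicted load value, rolling back on a mis-prediction."""
    regs = dict(regs)
    checkpoint = dict(regs)       # checkpoint register state before speculating

    regs[dest] = predicted        # speculative execution with the predicted value
    for op in dependent_ops:
        op(regs)

    if predicted == actual:       # prediction verified: keep the speculative work
        return regs

    regs = dict(checkpoint)       # mis-prediction: return to the checkpoint
    regs[dest] = actual           # assumed recovery path: replay with the real value
    for op in dependent_ops:
        op(regs)
    return regs


def add_one(r: Registers) -> None:
    r["r2"] = r["r1"] + 1


print(run_with_prediction({"r1": 0, "r2": 0}, "r1", 5, 7, [add_one]))
```

In the printed case the prediction (5) differs from the actual value (7), so the speculative result is discarded and the final state reflects the actual value.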

21

21. The processor of claim 19, wherein preserving execution state comprises the processor buffering results of instances of operations that are dependent on value predicted instances of read operations, and causing those buffered results that correspond to verified value predicted instances of read operations to be committed.

22

22. The processor of claim 21, wherein the dependent instances of operations are instances of write operations.

23

23. The processor of claim 21, wherein the processor includes a first store to host instances of write operations and a second store to host the buffered results of those instances of the write operations that are dependent on missing instances of read operations.

24

24. A method comprising: detecting a first instance of a read operation missing in a low-latency memory and requested from a high-latency memory while ignoring the read operation missing in the low-latency memory requested from a second low-latency memory; indicating the read operation; indicating an actual value for the first instance of the read operation, wherein the actual value is from the high-latency memory; detecting a subsequent instance of the read operation missing in the low-latency memory; and supplying the indicated value for the read operation's subsequent instance, wherein the indicated value is supplied based, at least in part, on the first and subsequent instances missing in the low-latency memory.

25

25. The method of claim 24, wherein the low-latency memory includes one or more of L1 cache and L2 cache.

26

26. The method of claim 24, wherein the high-latency memory includes one or more of L3 cache, random access memory, flash memory, erasable programmable memory, or read-only memory.

27

27. The method of claim 24 further comprising: preserving execution state; and recovering execution state if the supplied indicated value does not match an actual value determined for the read operation's subsequent instance.

28

28. The method of claim 27, wherein preserving execution state comprises checkpointing register state.

29

29. The method of claim 27, wherein preserving execution state comprises buffering results of instances of operations dependent on the subsequent instance of the read operation.

30

30. The method of claim 29 further comprising committing the buffered results if the supplied indicated value matches the determined actual value.

31

31. The method of claim 24, wherein the read operation and the actual value for the first instance of the read operation are indicated in a missing read operations value prediction encoding.

32

32. The method of claim 31, wherein the missing read operations value prediction encoding also indicates a value prediction qualifier, and said supplying of the first instance's actual value is in accordance with the value prediction qualifier.

33

33. The method of claim 32 further comprising: determining if the actual value of the subsequent instance of the read operation matches the supplied indicated value; if the subsequent instance's actual value and the supplied value match, committing the subsequent instance of the read operation; and updating the value prediction qualifier in accordance with said determining.

34

34. The method of claim 33, wherein said updating the value prediction qualifier comprises increasing the value prediction qualifier if the subsequent instance's actual value is determined to match the supplied value and decreasing the value prediction qualifier if the subsequent instance's actual value is determined not to match the supplied value.
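The increase-on-match, decrease-on-mismatch update in claim 34 resembles a counter that tracks recent accuracy. The sketch below assumes a saturating counter with hypothetical bounds of 0 and 3; the claim does not specify step sizes or limits.

```python
def update_qualifier(qualifier: int, matched: bool,
                     min_value: int = 0, max_value: int = 3) -> int:
    """Raise the qualifier on a correct prediction, lower it on a wrong one."""
    if matched:
        return min(qualifier + 1, max_value)   # saturate at the assumed upper bound
    return max(qualifier - 1, min_value)       # saturate at the assumed lower bound


q = 1
for outcome in (True, True, True, False):
    q = update_qualifier(q, outcome)
print(q)  # 2
```

Starting from 1, three matches saturate the counter at 3 and one mismatch brings it back to 2.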

35

35. The method of claim 24 embodied as a computer program product encoded on one or more machine-readable media.

36

36. A method comprising: recording actual values of prior instances of read operations that miss in a low-latency memory and are requested from a high-latency memory but not prior instances of read operations that miss in the low latency memory requested from a second low-latency memory; and supplying the actual values of the prior instances of the read operations to respective ones of subsequent instances of the read operations as predicted values, wherein the subsequent instances of the read operations also miss in the low-latency memory.

37

37. The method of claim 36, wherein the low-latency memory includes one or more of L1 cache and L2 cache.

38

38. The method of claim 36 further comprising: preserving execution state prior to speculative execution of the prior read operations instances with the supplied values; and recovering execution state when predicted values are determined to be mis-predicted values.

39

39. The method of claim 38, wherein preserving execution state comprises buffering results of instances of operations that are dependent on value predicted instances of the read operations, at least until the predicted values are verified.

40

40. The method of claim 38, wherein preserving execution state comprises checkpointing register state.

41

41. The method of claim 36 further comprising verifying predicted values as accurately predicted values.

42

42. The method of claim 36, wherein the actual values are recorded in a missing read operations value prediction encoding.

43

43. The method of claim 42 further comprising indicating the read operations in the missing read operations value prediction encoding.

44

44. The method of claim 43 further comprising indicating, in the missing read operations value prediction encoding, value prediction qualifiers for the indicated read operations.

45

45. The method of claim 44, wherein the value prediction qualifiers include one or more of confidence values and strength values.

46

46. The method of claim 43, wherein the read operations are indicated with one or more of their program counter high order bits, program counter low order bits, program counter, and history information.

47

47. The method of claim 46, wherein one or more of the read operations indicators are hashed individually or together.
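Claims 46 and 47 leave the hash unspecified. As one possible illustration, the sketch below XOR-folds the low-order PC bits, the high-order PC bits, and a history value into a single index. The `table_index` function, the 10-bit index width, and the choice of XOR are assumptions, not details from the patent.

```python
def table_index(pc: int, history: int, index_bits: int = 10) -> int:
    """Hash PC low bits, PC high bits, and history together into a table index."""
    mask = (1 << index_bits) - 1
    low = pc & mask                      # program counter low-order bits
    high = (pc >> index_bits) & mask     # next group of higher-order bits
    return low ^ high ^ (history & mask)


print(table_index(0x1234, 0b101))  # 565
```

Mixing history into the index lets the same static read map to different entries depending on the path that led to it, at the cost of more aliasing pressure on a small table.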

48

48. A processing unit comprising: a memory including a low-latency memory and a high latency memory; a missing read operations value prediction encoding to host predicted values for instances of read operations that miss in the low latency memory and are requested from the high-latency memory but not instances of read operations that miss in the low-latency memory requested from a second low-latency memory; and a memory operations unit coupled with the memory, the memory operations unit including, a missing read operation detection logic to detect instances of read operations that miss in the low-latency memory and to indicate those instances of read operations that miss in the low-latency memory and are requested from the high-latency memory.

49

49. The processing unit of claim 48, wherein the memory operations unit further comprises a missing read operation value predictor logic to utilize the missing read operations value prediction encoding to value predict for read operations instances indicated by the missing read operation detection logic.

50

50. The processing unit of claim 48, wherein the low-latency memory includes one or more of L1 cache and L2 cache.

51

51. The processing unit of claim 48, wherein the memory operations unit includes a hardware structure to host the missing read operations value prediction encoding.

52

52. The processing unit of claim 48, wherein a first region of the low-latency memory is marked during initialization of the processing unit to host the missing read operations value prediction encoding.

53

53. The processing unit of claim 52, wherein, during initialization of the processing unit, the marked region of the low-latency memory is marked to prevent entries of the missing read operations value prediction encoding from migrating to a second region of the low-latency memory.

54

54. The processing unit of claim 53, wherein the second region of the low-latency memory includes L1 cache.

55

55. The processing unit of claim 53 further comprising the missing read operation detection logic to cause issuance of first traps coincident with the missing read operation detection logic detecting a read operation instance missing in the memory, wherein the traps are issued to a trap handler.

56

56. The processing unit of claim 55, wherein the trap handler supplies predicted values to the memory operations unit for missing instances of read operations from the missing read operations value prediction encoding.

57

57. The processing unit of claim 56 further comprising the processing unit to generate second traps coincident with arrival of actual values for respective missing instances of read operations, wherein the trap handler updates the missing read operations value prediction encoding in accordance with the second traps.
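The two kinds of traps in claims 55 through 57 suggest a handler with two entry points: one that looks up a prediction when a miss is detected, and one that records the actual value when it arrives. The sketch below is a software model with hypothetical names (`TrapHandler`, `on_miss_trap`, `on_value_trap`); it keys the encoding by program counter and keeps only the last actual value, both of which are simplifying assumptions.

```python
from typing import Dict, Optional


class TrapHandler:
    """Toy model of a trap handler that predicts on misses and learns on arrivals."""

    def __init__(self) -> None:
        self.encoding: Dict[int, int] = {}   # pc -> last actual value seen

    def on_miss_trap(self, pc: int) -> Optional[int]:
        """First trap: a read missed; return a predicted value if one is recorded."""
        return self.encoding.get(pc)

    def on_value_trap(self, pc: int, actual: int) -> None:
        """Second trap: the actual value arrived; update the encoding."""
        self.encoding[pc] = actual


handler = TrapHandler()
print(handler.on_miss_trap(0x40))   # None: nothing recorded yet
handler.on_value_trap(0x40, 99)
print(handler.on_miss_trap(0x40))   # 99
```

The first miss of a read has nothing to predict from; the value recorded by the second trap is what a later miss of the same read receives.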

58

58. The processing unit of claim 52, wherein the processing unit invokes a value predictor finite state machine that performs value prediction with the missing read operations value prediction encoding.

59

59. The processing unit of claim 48, wherein the memory operations unit includes a load store queue or a memory disambiguation buffer.

60

60. The processing unit of claim 48 further comprising an operations retirement unit coupled with the memory operations unit, the operations retirement unit to cause installation of entries in the missing read operations value prediction encoding for missing instances of read operations.

61

61. An apparatus comprising: a memory including low-latency memory and high-latency memory; and means for detecting and indicating read operations instances that miss in the low-latency memory and are requested from the high-latency memory, the means ignoring those instances of read operations that miss in the low-latency memory requested from a second low-latency memory, and value predicting for respective ones of subsequent instances of the read operations that also miss in the low-latency memory and are requested from the high-latency memory.

62

62. The apparatus of claim 61, wherein the low-latency memory includes one or more of L1 cache and L2 cache.

63

63. The apparatus of claim 61 further comprising means for preserving and recovering execution state in accordance with whether value prediction are accurate.

64

64. A system comprising: a low-latency memory and a high-latency memory; and a processing unit coupled with the high-latency memory, the processing unit including, a missing read operations value prediction encoding to host predicted values for instances of read operations that miss in the low-latency memory and are requested from the high-latency memory but not instances of read operations that miss in the low-latency memory requested from a second low-latency memory, a missing read operation detection unit coupled with the low-latency memory, the missing read operations detection unit to indicate instances of read operations that miss in the low-latency memory and are requested from the high-latency memory.

65

65. The system of claim 64, wherein the processing unit includes the low-latency memory.

66

66. The system of claim 64, wherein the processing unit includes a memory operations unit that includes the missing read operation detection unit.

67

67. The system of claim 66, wherein the memory operations unit includes a load store queue or a memory disambiguation buffer.

68

68. The system of claim 66, wherein the memory operations unit includes a structure to host the missing read operations value prediction encoding.

69

69. The system of claim 68, wherein the structure includes a content addressable memory.

70

70. The system of claim 64, wherein the high-latency memory includes one or more of L3 cache, random access memory, flash memory, erasable programmable memory, and read-only memory.

71

71. The system of claim 64, wherein the processing unit includes a missing read operation value predictor logic to utilize the missing read operations value prediction encoding to value predict for read operations instances indicated by the missing read operation detection logic.

72

72. The system of claim 64, wherein the missing read operations value prediction encoding is instantiable in the low-latency memory.

73

73. The system of claim 72, wherein the missing read operations encoding can be shared among multiple cores of the processing unit.

74

74. The system of claim 73, wherein a second missing read operations encoding can also be shared among the multiple cores, and wherein the missing read operations encoding and the second missing read operations encoding are instantiated for different applications.

75

75. The system of claim 72, wherein the low-latency memory includes L2 cache and L1 cache, wherein the missing read operation value predictor is instantiable in the L2 cache, but entries thereof are prevented from migrating to the L1 cache.

76

76. The system of claim 64 further comprising a bus that couples the high-latency memory with the processing unit.

77

77. The system of claim 64, wherein the processing unit includes a first store to host instances of write operations dependent on value predicted instances of read operations and a second store to host results of the instances of dependent write operations at least until verification of corresponding value predictions.

78

78. An article of manufacture comprising a computer program product encoded in one or more machine-readable media, the computer program product comprising: a first sequence of instructions executable to select an entry in a missing read operations value prediction encoding that corresponds to a read operation instance that misses in a low-latency memory and is requested from a high-latency memory but not a read operation instance that misses in the low latency memory requested from a second low latency memory, and to supply a predicted value indicated in the selected entry for the missing read operation instance, wherein the selection is coincident with detection of the read operation instance missing in the low-latency memory; and a second sequence of instructions executable to update the missing read operations value prediction encoding to reflect accuracy of value predictions for missing instances of read operations.

79

79. The article of manufacture of claim 78, wherein the missing read operations value prediction encoding is instantiable in the low-latency memory.

80

80. The article of manufacture of claim 79, further comprising a third sequence of instructions executable to prevent migration of the entries of the missing read operations value prediction encoding from the low-latency memory to a second memory.

81

81. The article of manufacture of claim 80, wherein the low-latency memory includes L2 cache and the second memory includes L1 cache.

82

82. The article of manufacture of claim 78, further comprising trap handler code that includes the first and second sequences of instructions, wherein a first set of the traps at least indicate missing instances of read operations, and a second set of the traps at least indicate missing instances of read operations and actual values thereof.

83

83. The article of manufacture of claim 78 further comprising a value predictor finite state machine code that includes the first sequence of instructions.

84

84. The article of manufacture of claim 83, wherein the value predictor finite state machine code receives indications of missing instances of read operations and constructs addresses therefrom to access the missing read operations value prediction encoding.

85

85. The article of manufacture of claim 84, wherein the value predictor finite state machine contains a base register also used in constructing the addresses.

86

86. The article of manufacture of claim 84 further comprising trap handler code that includes the second sequence of instructions, wherein traps handled by the trap handler at least indicate missing instances of read operations.

87

87. The article of manufacture of claim 86, wherein the trap handler accesses the missing read operations value prediction encoding with addresses constructed from a base address and read operations identifying information.

88

88. The article of manufacture of claim 78, wherein the first sequence of instructions to select the entry in the missing read operations value prediction encoding comprises the first sequence of instructions to, access the entry in the missing read operations value prediction encoding with an index; and determine that the accessed entry corresponds to the missing read operation instance.
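The two steps in claim 88, accessing by index and then confirming the entry belongs to the missing read, resemble a tagged lookup in a direct-mapped table. The sketch below assumes the index is the program counter modulo the table size and that each entry stores the full program counter as a tag; `select_entry` and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional

Entry = Dict[str, int]


def select_entry(table: List[Optional[Entry]], pc: int) -> Optional[Entry]:
    """Access the table by index, then check the entry matches this read operation."""
    index = pc % len(table)              # step 1: access with an index
    entry = table[index]
    if entry is not None and entry["tag"] == pc:
        return entry                     # step 2: entry corresponds to this read
    return None                          # empty slot or a different read aliasing here


table: List[Optional[Entry]] = [None] * 8
table[0x13 % 8] = {"tag": 0x13, "value": 7}

print(select_entry(table, 0x13))  # {'tag': 19, 'value': 7}
print(select_entry(table, 0x1B))  # None (same index, different tag)
```

The second lookup shows why the correspondence check matters: `0x1B` maps to the same slot as `0x13`, and without the tag comparison it would receive a value recorded for a different read.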

89

89. The article of manufacture of claim 88, wherein the index is at least a part of the read operations static identifier.

90

90. The article of manufacture of claim 89, wherein the static identifier includes the read operation's program counter.

91

91. The article of manufacture of claim 88 further comprising constructing the index.

92

92. The article of manufacture of claim 91, wherein said constructing the index comprises hashing the read operation's static identifier.

93

93. The article of manufacture of claim 91, wherein said constructing the index comprises hashing the read operation's static identifier with history of the read operation.

94

94. An article of manufacture comprising a missing read operation value prediction structure encoded in one or more machine-readable media, the missing read operation value prediction structure comprising: an index field to indicate an index; a missing read operation field to indicate a read operation that misses in a low-latency memory and is requested from a high-latency memory but not a read operation that misses in the low-latency memory requested from a second low-latency memory; a predicted value field to indicate a predicted value for instances of a read operation indicated in the missing read operation field; and a value prediction qualifier field to indicate a value prediction qualifier, wherein a predicted value indicated in the predicted value field is supplied to an instance of a read operation indicated in the missing read operation field in accordance with a value prediction qualifier indicated in the value prediction qualifier field.
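The four fields recited in claim 94 map naturally onto a small record type. The sketch below is one hypothetical layout: the field types, the `supply` method, and the threshold of 2 are assumptions chosen only to show a predicted value being supplied in accordance with the qualifier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PredictionEntry:
    """One entry of a missing read operation value prediction structure."""
    index: int             # index field
    read_op: int           # missing read operation field (here: a program counter)
    predicted_value: int   # predicted value field
    qualifier: int         # value prediction qualifier field (here: a confidence count)

    def supply(self, threshold: int = 2) -> Optional[int]:
        """Supply the predicted value only if the qualifier meets the threshold."""
        return self.predicted_value if self.qualifier >= threshold else None


confident = PredictionEntry(index=3, read_op=0x4008, predicted_value=42, qualifier=3)
unsure = PredictionEntry(index=4, read_op=0x400C, predicted_value=17, qualifier=1)
print(confident.supply(), unsure.supply())  # 42 None
```

An entry whose qualifier is below the threshold still holds a value, but the read waits for the actual data instead of speculating on it.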

95

95. The article of manufacture of claim 94, wherein the index includes one or more of a program counter, low-order bits of a program counter, high-order bits of a program counter, and history information.

96

96. The article of manufacture of claim 94, wherein the indication of the read operation includes one or more of at least a part of a program counter of the read operation, history information of the program counter, a hash of at least a part of the program counter, a hash of the history information, and a hash of the program counter and the history information.

97

97. The article of manufacture of claim 94, wherein the value prediction qualifier includes one or more of confidence, strength, and counter.

Patent Metadata

Filing Date

April 14, 2004

Publication Date

March 4, 2008

Citation & reuse

Analysis on this page is generated by Patentable — an AI-powered patent intelligence platform. AI-generated summaries, explanations, and analysis may be reused with attribution and a visible link back to the canonical URL below. Patent abstracts and claims are USPTO public domain.

Cite as: Patentable. “Value prediction for missing read operations instances” (US-7340567). https://patentable.app/patents/US-7340567
