Causality-based memory ordering in a multiprocessing environment. A disclosed embodiment includes a plurality of processors and arbitration logic coupled to the plurality of processors. The processors and arbitration logic maintain processor consistency yet allow stores generated in a first order by any two or more of the processors to be observed consistent with a different order of stores by at least one of the other processors. Causality monitoring logic coupled to the arbitration logic monitors any causal relationships with respect to observed stores.
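The reordering condition summarized in the abstract can be illustrated with a small model. The following is a minimal sketch under stated assumptions, not the disclosed hardware: the names `Store`, `CausalityMonitor`, `issue_store`, `note_observation`, and `may_be_seen_out_of_order` are hypothetical, causality is tracked only one step deep (no transitive closure), and an "observation" is simply recorded by the caller.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Store:
    proc: int   # id of the issuing processor (hypothetical field)
    seq: int    # global issue index within this model only
    addr: str   # target memory location


@dataclass
class CausalityMonitor:
    # proc id -> set of stores that processor has observed so far
    observed: dict = field(default_factory=dict)
    # store -> stores its issuer had observed when the store was issued
    depends_on: dict = field(default_factory=dict)
    _count: int = 0

    def issue_store(self, proc, addr):
        """Create a store and snapshot what its issuer has already observed."""
        store = Store(proc, self._count, addr)
        self._count += 1
        self.depends_on[store] = frozenset(self.observed.get(proc, ()))
        return store

    def note_observation(self, proc, store):
        """Record that `proc` loaded data written by `store`."""
        self.observed.setdefault(proc, set()).add(store)

    def may_be_seen_out_of_order(self, first, second):
        """May some observer see `second` before `first` (issued earlier)?"""
        if first.proc == second.proc:
            return False  # stores from one processor stay in order
        if first.addr == second.addr:
            return False  # same-address accesses are globally ordered
        # Reordering is blocked only if second's issuer had observed first.
        return first not in self.depends_on[second]
```

In this model, two stores from different processors to different addresses may be presented in different orders to different observers until the issuer of the later store has observed the earlier one. From that point on, stores it issues are treated as causally dependent.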
1. An apparatus comprising: a plurality of processors; arbitration logic coupled to said plurality of processors, said arbitration logic and said plurality of processors to maintain processor consistency yet to allow a plurality of stores generated in a first order of stores by any two or more of said plurality of processors to be observed, consistent with a different order of stores that is inconsistent with the first order of stores, by at least one other of said plurality of processors; and causality monitoring logic coupled to said arbitration logic, the causality monitoring logic to monitor causal relationships with respect to observed stores.
2. The apparatus of claim 1 wherein said arbitration logic and said plurality of processors are to allow the plurality of stores to be observed, consistent with the different order of stores provided that causality between stores of the plurality of stores is maintained by the different order of stores allowed.
3. The apparatus of claim 1 wherein said arbitration logic is also to reorder the plurality of stores generated in the first order of stores to be observed, consistent with the different order of stores provided that causality is maintained with respect to observed stores.
4. The apparatus of claim 2 wherein the arbitration logic is to maintain causality by allowing a first sub-plurality of stores from said plurality of stores from a first sub-plurality of said plurality of processors to be observed, with respect to a second sub-plurality of other stores from a second sub-plurality of processors, consistent with the different order of stores if none of said second sub-plurality of processors have observed any of said first sub-plurality of stores.
5. The apparatus of claim 1 wherein said arbitration logic is also to ensure that stores from any one of said plurality of processors are observed in order by all of said plurality of processors.
6. The apparatus of claim 5 wherein the arbitration logic is to maintain causality by reordering a first store from a first processor to be observed, consistent with the different order of stores, with respect to a second store from a second processor only if said second processor has not observed said first store.
7. The apparatus of claim 1 wherein said arbitration logic is to ensure that loads from each processor appear to execute in order.
8. The apparatus of claim 1 wherein said arbitration logic is to ensure that loads and stores to the same address are globally ordered.
9. The apparatus of claim 1 wherein said arbitration logic is to guarantee causal relationships with respect to observed stores.
10. The apparatus of claim 5 wherein said arbitration logic is to ensure that loads from each processor appear to execute in order, to ensure that loads and stores to the same address are globally ordered, and to guarantee causal relationships with respect to observed stores.
11. The apparatus of claim 5 wherein the arbitration logic is to reorder the plurality of stores from the two or more of said plurality of processors to be observed consistent with multiple different orders of stores by different ones of said plurality of processors.
12. The apparatus of claim 5 wherein the plurality of processors are arranged in a plurality of clusters.
13. The apparatus of claim 5 wherein the arbitration logic is switch-based arbitration logic individually coupled to each of said plurality of processors and to at least one memory or another switch-based arbitration logic.
14. The apparatus of claim 1 wherein said apparatus is integrated into a single multiprocessing integrated circuit.
15. The apparatus of claim 1 further comprising store forwarding logic.
16. The apparatus of claim 15 wherein said store forwarding logic is to forward data from a first store to a first memory location from a first one of said plurality of processors to a load of the first memory location from a second one of said plurality of processors if no causal relationship exists with respect to the first store.
17. The apparatus of claim 16 wherein said store forwarding logic is to forward data from the first store from the first one of said plurality of processors to the load from the second one of said plurality of processors if a causal relationship exists with respect to the first store only if the first store is ordered before a next older store after the load which accesses the first memory location.
18. A system comprising: a plurality of processors; arbitration logic coupled to said plurality of processors, said arbitration logic and said plurality of processors to maintain processor consistency yet to allow a plurality of stores generated by any two or more of said plurality of processors to be observed by others of said plurality of processors, the observation by two or more of said others indicating different orderings of the plurality of stores; and causality monitoring logic coupled to said arbitration logic, the causality monitoring logic to monitor causal relationships with respect to observed stores.
19. The system of claim 18 wherein said arbitration logic and said plurality of processors are to allow the plurality of stores to be observed indicating different orderings of the plurality of stores, provided that causality is maintained by the different orderings of the plurality of stores allowed.
20. The system of claim 18 wherein said arbitration logic is also to reorder the plurality of stores to be observed, consistent with inconsistent different orders of stores, provided that causality is maintained with respect to observed stores.
21. A multiprocessing system comprising: a plurality of processors capable of generating a plurality of stores observable consistent with a first ordering of stores by at least one of said plurality of processors; a memory accessible to the plurality of processors through stores to said memory, said stores observable by the plurality of processors through loads from said memory; arbitration logic coupled to said plurality of processors, said arbitration logic including causality monitoring logic to identify a potential for causality between a sub-plurality of the plurality of stores and said arbitration logic to permit the sub-plurality of the plurality of stores to be observed by at least one other of said plurality of processors consistent with a second ordering of stores different from the first ordering of stores, said system to maintain a memory ordering consistency sufficient to ensure that causality is maintained with respect to observed stores.
22. The system of claim 21 wherein said system is to allow a plurality of stores by any two or more of said plurality of processors to be observed, consistent with inconsistent different orders of stores, by others of said plurality of processors provided that causality is maintained by the different orders of stores allowed.
23. The system of claim 21 wherein said system is to allow a plurality of stores generated in a first order of stores by any two or more of said plurality of processors to be observed, consistent with a different order of stores that is inconsistent with the first order of stores, by at least one other of said plurality of processors provided that causality is maintained with respect to observed stores.
24. The system of claim 23 wherein said arbitration logic is to reorder said plurality of stores generated in the first order of stores to be observed, consistent with the different order of stores.
25. The system of claim 21 wherein said arbitration logic is also to ensure that stores from any one of said plurality of processors are observed in order by all of said plurality of processors.
26. The system of claim 21 wherein said arbitration logic is to ensure that loads from each processor appear to execute in order.
27. The system of claim 21 wherein said arbitration logic is to ensure that loads and stores to the same address are globally ordered.
28. The system of claim 25 wherein said arbitration logic is to ensure that loads from each processor appear to execute in order, to ensure that loads and stores to the same address are globally ordered, and to guarantee causal relationships.
29. A system comprising: a plurality of processors; arbitration logic coupled to said plurality of processors, said arbitration logic comprising: store buffering logic to buffer stores received from at least one processor of the plurality of processors; and causality monitoring logic coupled to said store buffering logic, the causality monitoring logic to monitor causal relationships with respect to buffered stores.
30. The system of claim 29 wherein said arbitration logic further comprises store forwarding logic.
31. The system of claim 30 wherein said store forwarding logic is to forward data from a first store to a first memory location from a first one of said plurality of processors to a load of the first memory location from a second one of said plurality of processors if no causal relationship exists with respect to the first store and the second one of said plurality of processors.
32. The system of claim 31 wherein said store forwarding logic is to forward data from the first store from the first one of said plurality of processors to the load from the second one of said plurality of processors if a causal relationship exists only if the first store is ordered before a next older store after the load which accesses the first memory location.
33. The system of claim 29 wherein said arbitration logic further comprises access optimization logic to alter load and store access ordering of loads and stores received from said plurality of processors.
34. The system of claim 29 wherein said arbitration logic further comprises access optimization logic to ensure that stores from any one of said plurality of processors are observed in order by all of said plurality of processors, and to allow a plurality of stores generated in a first order of stores by any two or more of said plurality of processors to be observed, consistent with a different order of stores that is inconsistent with the first order of stores, by at least one of said plurality of processors provided that causality is not violated.
35. Arbitration logic comprising: buffering logic; access optimization logic; causality monitoring logic to monitor causal relationships with respect to observed stores; and arbitration logic to cooperate with the causality monitoring logic, said buffering logic and said access optimization logic to allow a plurality of stores generated in a first order of stores by any two or more of a plurality of processors to be observed by at least one of said plurality of processors, the observation by said at least one of the plurality of processors indicating an order of stores different from the first order of stores.
36. The arbitration logic of claim 35 wherein said arbitration logic is to ensure that stores from any one of said two or more of the plurality of processors are observed, consistent with the first order of stores by all of said two or more of the plurality of processors.
37. The arbitration logic of claim 35 wherein said arbitration logic is to ensure that loads from each processor appear to execute in order.
38. The arbitration logic of claim 35 wherein said arbitration logic is to ensure that loads and stores to the same address are globally ordered.
39. The arbitration logic of claim 36 wherein said arbitration logic is to ensure that loads from each processor appear to execute in order, to ensure that loads and stores to the same address are globally ordered, and to guarantee causal relationships with respect to observed stores.
40. A method comprising: receiving a plurality of stores generated in a first order of stores from a plurality of bus agents; transparently monitoring causal relationships for said plurality of bus agents with respect to said plurality of stores; allowing the plurality of stores to be observed by at least one other of said plurality of bus agents, said observation contradicting the first order of stores; and maintaining a processor consistency memory ordering model.
41. The method of claim 40 further comprising: ensuring that stores from any one of the plurality of bus agents are observed, consistent with the first order of said stores by all of said plurality of bus agents.
42. The method of claim 41 wherein allowing comprises: determining if causality with respect to observed stores would be violated by allowing the plurality of stores to be observed contradictory with the first order of stores; preventing a reordering of any one of the plurality of stores if the reordering would violate causality with respect to observed stores; and reordering a subset of the plurality of stores that does not violate causality with respect to observed stores to be observed contradictory with the first order of stores by at least one of said plurality of bus agents.
43. The method of claim 42 wherein determining if causality is violated comprises determining whether a store depends on a prior non-globally-observed store.
44. The method of claim 40 wherein a store is observed when a processor has loaded a memory location indicated by the store.
45. The method of claim 40 wherein a store is observed when a processor loads and actually uses a memory location indicated by the store.
46. The method of claim 42 wherein preventing comprises preventing a second store from being globally observed prior to a first store if said first store is executed by a first processor prior to the second store being executed by a second processor and if the second processor loaded a memory location indicated by the first store prior to executing the second store.
47. The method of claim 40 wherein allowing comprises reordering store transactions in order to more efficiently access memory.
48. The method of claim 42 wherein preventing comprises setting one or more ordering bits in a store buffer to indicate an ordering restriction.
49. A system comprising: a plurality of processors; causality monitoring logic to monitor causal relationships with respect to stores observed by one or more of said plurality of processors; and arbitration logic coupled to said plurality of processors, said arbitration logic comprising store forwarding logic to forward data from a first store to a first memory location from a first one of said plurality of processors to a load of the first memory location from a second one of said plurality of processors if no causal relationship exists with respect to the first store and the second one of said plurality of processors.
50. The system of claim 49 wherein said store forwarding logic is to forward data from the first store from the first one of said plurality of processors to the load from the second one of said plurality of processors if a causal relationship exists only if the first store is ordered before a next older store which accesses the first memory location after the load.
51. The system of claim 50 wherein the store forwarding logic is also to signal a causal relationship being established when data is forwarded from the first store to the first memory location from the first one of said plurality of processors to the load of the first memory location from the second one of said plurality of processors when no prior causal relationship exists.
52. The system of claim 49 wherein said arbitration logic is to allow a plurality of stores generated in a first order of stores by any two or more of said plurality of processors to be observed, consistent with a different order of stores that is inconsistent with the first order of stores, by at least one of said plurality of processors provided that causality is maintained.
53. The system of claim 52 wherein the arbitration logic comprises access optimization logic to reorder the plurality of stores generated in the first order of stores to be observed, consistent with the different order of stores.
54. An apparatus comprising: a plurality of buffers; causality monitoring logic coupled to said plurality of buffers, the causality monitoring logic to monitor causal relationships with respect to buffered stores; store forwarding logic to forward data from a first store to a first memory location from a first one of a plurality of processors to a load of the first memory location from a second one of said plurality of processors if no causal relationship exists with respect to the first store and the second one of said plurality of processors.
55. The apparatus of claim 54 wherein said store forwarding logic is to forward data from the first store from the first one of said plurality of processors to the load from the second one of said plurality of processors if a causal relationship exists only if the first store is ordered before a next older store which accesses the first memory location after the load.
56. The apparatus of claim 55 wherein the store forwarding logic is also to forward data from a second store from the first one of said plurality of processors to a second load which is also from said plurality of processors.
57. The apparatus of claim 56 wherein the store forwarding logic is also to signal a causal relationship being established when data is forwarded from the first store to the first memory location from the first one of said plurality of processors to the load of the first memory location from the second one of said plurality of processors when no prior causal relationship exists.
58. A method comprising: buffering a first store to a first memory location from a first agent; transparently monitoring loads with respect to said first store; determining whether a causal relationship exists with respect to the first store and a first load from a second agent; and forwarding data from said first store to satisfy the first load if no causal relationship exists.
59. The method of claim 58 further comprising: forwarding data from said first store to satisfy the first load if the causal relationship exists only if the first store is ordered before a next older store which accesses the first memory location after the load.
60. The method of claim 58 further comprising: forwarding data from a second store from the first one of a plurality of processors to a second load which is also from said plurality of processors.
61. The method of claim 59 further comprising: establishing a causal relationship when data is forwarded from the first store to the first memory location from the first one of said plurality of processors to the load of the first memory location from the second one of said plurality of processors when no prior causal relationship exists.
62. An apparatus comprising: a plurality of processors; arbitration logic coupled to said plurality of processors, said arbitration logic to reorder a plurality of stores from two or more of said plurality of processors to be observed by different ones of said plurality of processors, said observations by different ones of said plurality of processors being permitted to contradict in accordance with said reordering; and causality monitoring logic coupled to said arbitration logic, the causality monitoring logic to monitor causal relationships with respect to observed stores.
63. The apparatus of claim 62 further comprising store forwarding logic.
64. The apparatus of claim 63 wherein a first store of the plurality of stores is observed by a forwarding of data from the first store to a load from a second one of said plurality of processors.
65. The apparatus of claim 64 wherein said arbitration logic is also to ensure that stores from any one of said plurality of processors are observed, consistent with a first order of said stores by all of said plurality of processors.
66. The apparatus of claim 63, the store forwarding logic to forward data from a first store to a first memory location from a first one of said plurality of processors to a load of the first memory location from a second one of said plurality of processors if no causal relationship exists with respect to the first store and the second one of said plurality of processors.
67. The apparatus of claim 66 wherein said arbitration logic is to ensure that loads and stores to the same address are globally ordered.
68. An apparatus comprising: a memory accessible to at least a first portion of a plurality of bus agents through stores to said memory, said stores observable by at least a second portion of the plurality of bus agents through loads from said memory; a first bus agent of the plurality of bus agents to generate a first store to update a first initial data value at a first address of the memory; a second bus agent of the plurality of bus agents to generate a second store to update a second initial data value at a second address of the memory; a third bus agent of the plurality of bus agents to observe a first order for the first and second stores to the memory; a fourth bus agent of the plurality of bus agents to observe a second order for the first and second stores to the memory; causality checking logic to identify a potential for causality between the first and second stores to memory when the first bus agent observes the second store to memory prior to generating the first store to memory or when the second bus agent observes the first store to memory prior to generating the second store to memory; and arbitration logic coupled with the plurality of bus agents and with the causality checking logic to ensure a memory ordering wherein the first order observed by the third bus agent is the same as the second order observed by the fourth bus agent whenever the potential for causality between the first and second stores to memory is identified, but to allow for at least one memory ordering wherein the first order observed by the third bus agent is different than the second order observed by the fourth bus agent when the potential for causality between the first and second stores to memory is not identified; a difference between the first order and the second order being indicated by an updated data value from the first store being returned to the third bus agent in response to loading data from the first address prior to the second initial data value being returned to the third bus agent in response to loading data from the second address, and an updated data value from the second store being returned to the fourth bus agent in response to loading data from the second address prior to the first initial data value being returned to the fourth bus agent in response to loading data from the first address.
69. The apparatus of claim 68 further comprising store forwarding logic coupled with the arbitration logic to forward data from the first store in response to a load from the first address by the third bus agent of the plurality of bus agents if no potential for causality between the first and second stores was identified through the first bus agent observing the second store to memory.
70. The apparatus of claim 69 wherein the store forwarding logic is coupled with the arbitration logic to forward data from the second store in response to a load from the second address by the fourth bus agent of the plurality of bus agents if no potential for causality between the first and second stores was identified through the second bus agent observing the first store to memory.
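Claims 42 to 48 describe preventing a reordering by setting ordering bits in a store buffer when a store depends on an earlier store its issuer has observed. The following is a minimal sketch of that idea under stated assumptions, not the claimed circuit: the names `Entry`, `StoreBuffer`, `order_after`, `can_drain`, and `drain` are hypothetical, "draining" stands in for a store becoming globally observed, ordering bits are modeled as a set of entry ids, unwritten memory reads as 0, and forwarding from the buffer is unconditional, which simplifies the conditions of claims 16 and 17.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    proc: int
    addr: str
    value: int
    # Modeled "ordering bits": ids of buffered stores that must drain first.
    order_after: set = field(default_factory=set)


class StoreBuffer:
    def __init__(self):
        self.memory = {}    # globally observed values
        self.entries = {}   # entry id -> Entry, ids increase with issue order
        self.seen = {}      # proc -> ids of buffered stores it has observed
        self._next_id = 0

    def store(self, proc, addr, value):
        """Buffer a store and set its ordering restrictions."""
        deps = set(self.seen.get(proc, ()))
        # Program order and same-address order are also preserved.
        deps |= {i for i, e in self.entries.items()
                 if e.proc == proc or e.addr == addr}
        eid = self._next_id
        self._next_id += 1
        self.entries[eid] = Entry(proc, addr, value, deps)
        return eid

    def load(self, proc, addr):
        """Return the youngest buffered value for addr, else memory."""
        for eid in sorted(self.entries, reverse=True):
            entry = self.entries[eid]
            if entry.addr == addr:
                if entry.proc != proc:
                    # Observing another processor's store creates a causal link.
                    self.seen.setdefault(proc, set()).add(eid)
                return entry.value
        return self.memory.get(addr, 0)

    def can_drain(self, eid):
        """True when no store this entry is ordered after is still buffered."""
        return not any(d in self.entries for d in self.entries[eid].order_after)

    def drain(self, eid):
        """Make a store globally observed, honoring ordering restrictions."""
        if not self.can_drain(eid):
            raise ValueError("ordering restriction not yet satisfied")
        entry = self.entries.pop(eid)
        self.memory[entry.addr] = entry.value
```

A short usage example: independent stores from two processors can drain in either order, but once processor 1 loads processor 0's buffered store, its next store is held until that earlier store has drained.

```python
buf = StoreBuffer()
a = buf.store(0, "x", 1)
b = buf.store(1, "y", 2)      # independent of a, may drain first
buf.load(1, "x")              # processor 1 observes a
c = buf.store(1, "z", 3)      # now ordered after a and b
print(buf.can_drain(b), buf.can_drain(c))
```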
December 29, 1999
January 20, 2004