A method is provided for sending and receiving data between a first processor including a first cache memory and a second processor including a second cache memory via a shared memory. The method includes classifying, by the first processor, a transfer data area that stores data transferred between the first and second processors in the shared memory as a first area filling one cache line and a second area not filling one cache line, copying, by the first processor, data in the second area into a divided data area in the shared memory, the divided data area being aligned with a cache line in the first cache memory, and processing, by the second processor, the data in the first area and the data in the divided data area as data from the first processor.
Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method for sending and receiving data between a first processor including a first cache memory and a second processor including a second cache memory via a shared memory, the first and second cache memories having a plurality of cache lines having a fixed length, the method comprising: classifying, by the first processor, a transfer data area that stores data transferred between the first and second processors in the shared memory, as a first area filling a whole of one cache line in the first cache memory and a second area not filling one cache line in the first cache memory; copying, by the first processor, data in the second area into a divided data area in the shared memory, the divided data area being aligned with a cache line in the first cache memory; and processing, by the second processor, while treating data in the divided data area as data in the second area, data in the first area and data in the divided data area as data from the first processor.
In a multiprocessor system where two processors, each with a cache of fixed-length cache lines, exchange data through shared memory, the first (sending) processor splits the transfer data area into two kinds of region: a first area made up of portions that fill whole cache lines, and a second area made up of portions that fill only part of a cache line. The sender copies the data in the second area into a separate "divided data area" in shared memory that is aligned to a cache line. The second (receiving) processor then processes the data in the first area together with the data in the divided data area, treating the latter as the contents of the second area, so the transfer works even when the data does not start or end on a cache-line boundary.
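The classification step in claim 1 can be sketched as simple address arithmetic. This is a minimal illustration, not the patent's implementation: the 64-byte line size, the `xfer_split` struct, and the `classify_area` function are all assumptions introduced here, and offsets are taken to be relative to a line-aligned start of shared memory.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64u  /* assumed line size; the claims only require a fixed length */

/* Hypothetical result type: byte ranges given as offsets into shared memory. */
typedef struct {
    size_t head_off, head_len;  /* second area: partial line at the head */
    size_t full_off, full_len;  /* first area: whole cache lines */
    size_t tail_off, tail_len;  /* second area: partial line at the end */
} xfer_split;

/* Split the transfer data area [off, off + len) at cache-line boundaries. */
static xfer_split classify_area(size_t off, size_t len)
{
    xfer_split s = {0, 0, 0, 0, 0, 0};
    size_t end = off + len;
    /* First boundary at or after the start, last boundary at or before the end. */
    size_t first_full = (off + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
    size_t last_full = end / CACHE_LINE * CACHE_LINE;

    assert(len > 0);
    if (first_full > last_full) {
        /* The whole area sits inside one line without filling it. */
        s.head_off = off;
        s.head_len = len;
        return s;
    }
    s.head_off = off;
    s.head_len = first_full - off;        /* 0 if the start is aligned */
    s.full_off = first_full;
    s.full_len = last_full - first_full;  /* 0 if no whole line is covered */
    s.tail_off = last_full;
    s.tail_len = end - last_full;         /* 0 if the end is aligned */
    return s;
}
```

For a 200-byte area starting at offset 10, this yields a 54-byte head fragment, two whole lines (128 bytes) starting at offset 64, and an 18-byte tail fragment starting at offset 192.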
2. The method according to claim 1, further comprising: analyzing, by the first processor, cache lines of the first cache memory and the shared memory, and classifying the transfer data area as the first area and the second area.
In the method of claim 1, the first processor analyzes the cache lines of its own cache and of the shared memory and, based on that analysis, splits the transfer data area into the first area (whole cache lines) and the second area (partial cache lines).
3. The method according to claim 1, further comprising: writing, by the first processor, information for identifying the first area and the divided data area into a predetermined area in the shared memory; and invalidating, by the second processor, data in an area of the second cache memory corresponding to data in the first area and data in the divided data area based on the information.
In the method of claim 1, the first processor also writes information identifying the first area and the divided data area into a predetermined area of shared memory. The second processor uses this information to invalidate the lines of its own cache that correspond to those areas, so that it reads the transferred data from shared memory rather than from stale cached copies.
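One way to picture claim 3 is a small descriptor that the sender leaves in the predetermined area and that the receiver walks line by line. The sketch below is an assumption-heavy illustration: the `xfer_info` layout, the 64-byte line size, and the `line_op` callback are hypothetical, and the callback stands in for whatever cache-invalidate operation the real platform provides (here it only records the line offsets).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed line size */

/* Hypothetical contents of the "predetermined area": where the two areas live. */
typedef struct {
    uint32_t first_off;    /* first area offset, cache-line aligned */
    uint32_t first_len;    /* first area length, multiple of CACHE_LINE */
    uint32_t divided_off;  /* divided data area offset, cache-line aligned */
    uint32_t divided_len;  /* divided data area length, multiple of CACHE_LINE */
} xfer_info;

/* Stand-in for a platform-specific per-line invalidate operation. */
typedef void (*line_op)(size_t line_off, void *ctx);

/* Receiver side: invalidate every line of both areas; returns the line count. */
static size_t invalidate_from_info(const xfer_info *info, line_op invalidate, void *ctx)
{
    const uint32_t off[2] = { info->first_off, info->divided_off };
    const uint32_t len[2] = { info->first_len, info->divided_len };
    size_t count = 0;

    for (int i = 0; i < 2; i++) {
        /* Both areas are line-aligned, so no neighbouring data shares a line. */
        assert(off[i] % CACHE_LINE == 0 && len[i] % CACHE_LINE == 0);
        for (size_t o = 0; o < len[i]; o += CACHE_LINE) {
            invalidate(off[i] + o, ctx);
            count++;
        }
    }
    return count;
}

/* Demo callback: records which lines would be invalidated instead of touching a cache. */
typedef struct { size_t lines[16]; size_t n; } line_log;

static void record_line(size_t line_off, void *ctx)
{
    line_log *log = (line_log *)ctx;
    if (log->n < 16)
        log->lines[log->n++] = line_off;
}
```

Because both areas are whole cache lines, the invalidation never discards a line that also holds unrelated data, which is the point of copying the partial-line fragments into an aligned area first.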
4. The method according to claim 3, wherein the predetermined area is a non-cache area in the shared memory.
In the method of claim 3, the predetermined area that holds the identifying information is a non-cached region of shared memory, so accesses to it do not go through either processor's cache.
5. The method according to claim 1, wherein the second area is at least one of an area including a head of the transfer data area and an area including an end of the transfer data area.
In the method of claim 1, the second area (the partial-line portion) is the portion containing the head of the transfer data area, the portion containing its end, or both.
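The head and tail handling in claims 1 and 5 can be sketched as two small routines: the sender copies each partial-line fragment into its own line-aligned slot, and the receiver rebuilds the data from the head slot, the in-place whole lines, and the tail slot. The function names, the slot offsets, and the 64-byte line size are assumptions for illustration, and the sketch assumes the transfer area crosses at least one cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 64u  /* assumed line size */

/* Sender: copy one partial-line fragment into a line-aligned slot
 * of the divided data area. */
static void stage_fragment(uint8_t *shm, size_t frag_off, size_t frag_len, size_t slot_off)
{
    assert(slot_off % CACHE_LINE == 0);  /* the slot starts on a line boundary */
    assert(frag_len < CACHE_LINE);       /* a fragment never fills a whole line */
    memcpy(shm + slot_off, shm + frag_off, frag_len);
}

/* Receiver: rebuild the transfer data [off, off + len) into out, reading the
 * head and tail from their slots and the whole lines in place. */
static void gather_transfer(const uint8_t *shm, size_t off, size_t len,
                            size_t head_slot, size_t tail_slot, uint8_t *out)
{
    size_t end = off + len;
    size_t body = (off + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE; /* start of first area */
    size_t tail = end / CACHE_LINE * CACHE_LINE;                    /* end of first area */

    assert(body <= tail);  /* sketch assumes at least one boundary is crossed */
    memcpy(out, shm + head_slot, body - off);             /* head fragment */
    memcpy(out + (body - off), shm + body, tail - body);  /* whole lines */
    memcpy(out + (tail - off), shm + tail_slot, end - tail); /* tail fragment */
}
```

The receiver never reads the original head and tail locations, so it does not depend on cache lines that may also hold data unrelated to the transfer.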
6. The method according to claim 1, further comprising: evicting, by the first processor, data in an area in the first cache memory corresponding to the first area and the divided data area in the shared memory to the shared memory, after copying data in the second area to the divided data area.
In the method of claim 1, after copying the second-area data into the divided data area, the first processor evicts (writes back) the lines of its own cache that correspond to the first area and the divided data area, so that shared memory holds the latest data when the second processor reads it.
7. The method according to claim 1, further comprising: reading, by the second processor, the shared memory after receiving a notification of completion of transfer preparation from the first processor.
In the method of claim 1, the second processor reads the shared memory only after the first processor notifies it that transfer preparation is complete, so it does not read the areas before the data is in place.
8. The method according to claim 7, wherein the notification is performed by sending an interrupt signal to the second processor.
In the method of claim 7, the notification that transfer preparation is complete is delivered by sending an interrupt signal to the second processor.
9. The method according to claim 7, wherein the notification is performed by changing, by the first processor, a value in a register to a predetermined value, and detecting, by the second processor, the changed value, wherein the register is capable of being written from the first processor and referred to from the second processor.
In the method of claim 7, the notification uses a register that the first processor can write and the second processor can read: the first processor sets the register to a predetermined value, and the second processor detects the change.
10. The method according to claim 7, wherein the notification is performed by changing a value in a specific area in the shared memory to a predetermined value by the first processor, and detecting the changed value by the second processor.
In the method of claim 7, the notification uses a specific area of shared memory: the first processor changes the value there to a predetermined value, and the second processor detects the change.
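The shared-memory notification in claim 10 resembles a ready flag that one side sets and the other polls. The sketch below uses C11 atomics to show that pattern; the `notify_area` type, the flag values, and the function names are hypothetical. It also assumes hardware-coherent access to the flag. On a system whose caches are not kept coherent, which is the setting the claims address, the flag would additionally have to sit in non-cached memory or be written back and invalidated explicitly, and C11 atomics do not do that on their own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define XFER_IDLE  0u
#define XFER_READY 1u  /* hypothetical "predetermined value" */

/* Hypothetical notification word placed in a specific area of shared memory. */
typedef struct {
    atomic_uint flag;
} notify_area;

/* First processor: publish "transfer preparation complete".
 * Release ordering keeps earlier data writes ahead of the flag update. */
static void notify_ready(notify_area *n)
{
    atomic_store_explicit(&n->flag, XFER_READY, memory_order_release);
}

/* Second processor: check whether the predetermined value has appeared.
 * Acquire ordering keeps later data reads behind the flag check. */
static bool poll_ready(notify_area *n)
{
    return atomic_load_explicit(&n->flag, memory_order_acquire) == XFER_READY;
}
```

In use, the receiver would call `poll_ready` in a loop (or on a timer) and begin reading the first area and the divided data area only once it returns true.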
11. A method for sending and receiving data between a first processor including a first cache memory and a second processor including a second cache memory via a shared memory, the first and second cache memories having a plurality of cache lines having a fixed length, the method comprising: classifying, by the first processor, a transfer data area that stores data transferred between the first and second processors in the shared memory, as a first area filling a whole of one cache line in the first cache memory and a second area not filling one cache line in the first cache memory; securing, by the first processor, a divided data area aligned with a cache line in the first cache memory in the shared memory; writing, by the second processor, data transferred between the first and the second processors into the first area in the shared memory and the divided data area in the shared memory; and processing, by the first processor, data written into the first area and the divided data area in the shared memory as data from the second processor.
Two processors, each with a cache of fixed-length cache lines, exchange data through shared memory, this time with the second processor writing and the first processor reading. The first processor splits the transfer data area into a first area (portions that fill whole cache lines) and a second area (portions that fill only part of a cache line) and reserves a cache-line-aligned "divided data area" in shared memory. The second processor writes the transferred data into the first area and the divided data area rather than into the partial-line second area. The first processor then processes the data written to both areas as data from the second processor.
12. The method according to claim 11, further comprising: copying data read by the first processor in the divided data area in the shared memory into an area corresponding to the second area in the shared memory; and processing data in the first area and the second area in the shared memory as data from the second processor.
In the method of claim 11, the first processor copies the data it reads from the divided data area into the location in shared memory that corresponds to the second area, and then processes the data in the first area and the second area as data from the second processor.
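The copy-back step in claim 12 can be sketched as the mirror image of staging: the first processor moves a fragment from its line-aligned slot in the divided data area into its final partial-line location. The function name, offsets, and 64-byte line size are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE 64u  /* assumed line size */

/* First processor: copy a fragment that the second processor wrote into a
 * line-aligned slot back to its place in the second area. Only the fragment's
 * own bytes are written, so other bytes in the same cache line stay intact. */
static void unstage_fragment(uint8_t *shm, size_t slot_off, size_t frag_off, size_t frag_len)
{
    assert(slot_off % CACHE_LINE == 0);  /* the slot starts on a line boundary */
    assert(frag_len < CACHE_LINE);       /* a fragment never fills a whole line */
    memcpy(shm + frag_off, shm + slot_off, frag_len);
}
```

Since only the first processor ever writes to the partial lines, the second processor never has to touch a cache line that it shares with data outside the transfer.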
13. The method according to claim 11, further comprising: analyzing, by the first processor, cache lines of the first cache memory and the shared memory, and classifying the transfer data area into the first area and the second area.
In the method of claim 11, the first processor analyzes the cache lines of its own cache and of the shared memory and, based on that analysis, splits the transfer data area into the first area (whole cache lines) and the second area (partial cache lines).
14. The method according to claim 11, further comprising: evicting or invalidating, by the first processor, data in an area of the first cache memory corresponding to the first area and the divided data area in the shared memory to the shared memory after securing the divided data area.
In the method of claim 11, after reserving the divided data area, the first processor evicts or invalidates the lines of its own cache that correspond to the first area and the divided data area, so that its cache holds no stale or dirty copies of those areas when the second processor writes to them.
15. The method according to claim 11, further comprising: writing, by the first processor, information for identifying the first area and the divided data area into a predetermined area in the shared memory; and writing, by the second processor, data into the shared memory based on the information.
In the method of claim 11, the first processor writes information identifying the first area and the divided data area into a predetermined area of shared memory, and the second processor uses that information to decide where in shared memory to write the data.
16. The method according to claim 15, further comprising: invalidating, by the second processor, data in an area of the second cache memory corresponding to the first area and the divided data area in the shared memory before writing the data into the shared memory.
In the method of claim 15, before writing the data into shared memory, the second processor invalidates the lines of its own cache that correspond to the first area and the divided data area.
17. The method according to claim 15, wherein the predetermined area is a non-cache area in the shared memory.
In the method of claim 15, the predetermined area that holds the identifying information is a non-cached region of shared memory, so accesses to it do not go through either processor's cache.
18. The method according to claim 11, wherein the second area is at least one of an area including a head of the transfer data area and an area including an end of the transfer data area.
In the method of claim 11, the second area (the partial-line portion) is the portion containing the head of the transfer data area, the portion containing its end, or both.
19. The method according to claim 11, further comprising: evicting, by the second processor, data in an area of the second cache memory corresponding to the first area and the divided data area in the shared memory to the shared memory, after writing data into the first area and the divided data area.
In the method of claim 11, after writing data into the first area and the divided data area, the second processor evicts (writes back) the lines of its own cache that correspond to those areas, so that the written data reaches shared memory where the first processor can read it.
20. A multiprocessor system configured to send and receive data between a first processor including a first cache memory and a second processor including a second cache memory via a shared memory, the first and second cache memories having a plurality of cache lines having a fixed length, the system comprising: the first processor comprising: a classification unit configured to classify a transfer data area that stores data transferred between the first processor and the second processor in the shared memory, as a first area filling a whole of one cache line in the first cache memory and a second area not filling one cache line in the first cache memory; and a copying unit configured to copy data in the second area into a divided data area in the shared memory, the divided data area being aligned with a cache line in the first cache memory, and the second processor comprising: a processing unit configured to process, while treating data in the divided data area as data in the second area, data in the first area and data in the divided data area as data from the first processor.
A multiprocessor system in which two processors, each with a cache of fixed-length cache lines, exchange data through shared memory. The first processor has a classification unit that splits the transfer data area into a first area (portions that fill whole cache lines) and a second area (portions that fill only part of a cache line), and a copying unit that copies the second-area data into a cache-line-aligned "divided data area" in shared memory. The second processor has a processing unit that processes the data in the first area and the divided data area as data from the first processor, treating the divided-area data as the contents of the second area.
21. A multiprocessor system configured to send and receive data between a first processor including a first cache memory and a second processor including a second cache memory via a shared memory, the first and second cache memories having a plurality of cache lines having a fixed length, the system comprising: the first processor comprising: a classification unit configured to classify a transfer data area that stores data transferred between the first processor and the second processor in the shared memory, as a first area filling a whole of one cache line in the first cache memory and a second area not filling one cache line in the first cache memory; and a securing unit configured to secure a divided data area in the shared memory, the divided data area being aligned with a cache line in the second area, and the second processor comprising: a writing unit configured to write data transferred between the first processor and the second processor into the first area in the shared memory and the divided data area in the shared memory, wherein the first processor processes the data written into the first area and the divided data area in the shared memory as data from the second processor.
A multiprocessor system in which two processors, each with a cache of fixed-length cache lines, exchange data through shared memory. The first processor has a classification unit that splits the transfer data area into a first area (portions that fill whole cache lines) and a second area (portions that fill only part of a cache line), and a securing unit that reserves a cache-line-aligned "divided data area" in shared memory. The second processor has a writing unit that writes the transferred data into the first area and the divided data area. The first processor then processes the data written to those areas as data from the second processor.
22. A first processor in a multiprocessor system configured to send and receive data between the first processor including a first cache memory and a second processor including a second cache memory via a shared memory, the first and second cache memories having a plurality of cache lines having a fixed length, the first processor comprising: a classification unit configured to classify a transfer data area that stores data transferred between the first processor and the second processor in the shared memory, as a first area filling a whole of one cache line in the first cache memory and a second area not filling one cache line in the first cache memory; and a copying unit configured to copy data in the second area into a divided data area in the shared memory, the divided data area being aligned with a cache line in the first cache memory.
A first processor for a multiprocessor system in which two processors, each with a cache of fixed-length cache lines, exchange data through shared memory. It has a classification unit that splits the transfer data area into a first area (portions that fill whole cache lines) and a second area (portions that fill only part of a cache line), and a copying unit that copies the second-area data into a cache-line-aligned "divided data area" in shared memory. This claim covers only the sending side of the transfer.
23. A second processor in a multiprocessor system configured to send and receive data between a first processor including a first cache memory and the second processor including a second cache memory via a shared memory, the first and second cache memories having a plurality of cache lines having a fixed length, wherein the first processor classifies transfer data area that stores the data transferred between the first processor and the second processor in the shared memory, as a first area filling a whole of one cache line in the first cache memory and a second area not filling one cache line in the first cache memory, and secures a divided data area in the shared memory, the divided data area being aligned with a cache line in the first cache memory, the second processor comprising: a writing unit configured to write data transferred between the first processor and the second processor into the first area in the shared memory and the divided data area in the shared memory; and a notification unit configured to notify for causing the first processor to read data written in the first area and the divided data area in the shared memory, after writing the data by the writing unit.
A second processor for a multiprocessor system in which two processors, each with a cache of fixed-length cache lines, exchange data through shared memory. In this system the first processor splits the transfer data area into a first area (portions that fill whole cache lines) and a second area (portions that fill only part of a cache line) and reserves a cache-line-aligned divided data area in shared memory. The second processor has a writing unit that writes the transferred data into the first area and the divided data area, and a notification unit that, once the write is done, notifies the first processor so that it reads the data from those areas. This claim covers only the writing side of the transfer.
October 12, 2009
August 6, 2013