Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. An integrated circuit for a coherent data processing system including a system memory, the integrated circuit comprising: a first communication interface for communicatively coupling the integrated circuit with the coherent data processing system; a second communication interface for communicatively coupling the integrated circuit with an accelerator unit including an effective address-based accelerator cache for buffering copies of data from the system memory of the coherent data processing system; a set-associative real address-based directory inclusive of contents of the accelerator cache, wherein the real address-based directory assigns entries based on real addresses utilized to identify storage locations in the system memory; and directory control logic that configures at least a number of congruence classes utilized in the real address-based directory based on configuration parameters provided by or on behalf of the accelerator unit.
This claim covers an integrated circuit for a coherent data processing system that includes a system memory. The circuit sits between the coherent system and an accelerator unit: a first communication interface connects it to the coherent system, and a second connects it to the accelerator unit. The accelerator unit has its own cache, indexed by effective addresses, which holds copies of data from system memory. The circuit keeps a set-associative directory whose entries are assigned by real address, that is, by the addresses that identify storage locations in system memory. The directory is inclusive of the accelerator cache, meaning every line held in the accelerator cache has a corresponding directory entry; the directory records which lines are cached, not the data itself. Directory control logic sets at least the number of congruence classes (sets) in the directory based on configuration parameters supplied by, or on behalf of, the accelerator unit, which lets the directory be sized for the particular accelerator attached. The claim does not say when configuration takes place or what the parameters are derived from.
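As a rough illustration of how a configurable number of congruence classes changes where a real address lands in a set-associative directory, here is a minimal sketch. The 128-byte line size, the modulo indexing, and the names DirectoryConfig, congruence_class_index, and directory_tag are assumptions made for the example; the claim does not specify any of them.

```python
from dataclasses import dataclass

# Assumed cache line size; the claim does not state one.
CACHE_LINE_BYTES = 128


@dataclass(frozen=True)
class DirectoryConfig:
    """Hypothetical directory geometry set by the control logic."""
    num_congruence_classes: int  # configurable per claim 1
    num_ways: int                # entries per congruence class


def congruence_class_index(real_address: int, cfg: DirectoryConfig) -> int:
    """Pick the congruence class (set) for a real address by simple modulo."""
    line_number = real_address // CACHE_LINE_BYTES
    return line_number % cfg.num_congruence_classes


def directory_tag(real_address: int, cfg: DirectoryConfig) -> int:
    """Remaining address bits stored in the entry to identify the line."""
    line_number = real_address // CACHE_LINE_BYTES
    return line_number // cfg.num_congruence_classes


large = DirectoryConfig(num_congruence_classes=256, num_ways=4)
small = DirectoryConfig(num_congruence_classes=64, num_ways=4)
addr = 0x12380
print(congruence_class_index(addr, large), directory_tag(addr, large))
print(congruence_class_index(addr, small), directory_tag(addr, small))
```

With fewer congruence classes, the same real address maps to a different class and more address bits must be kept in the tag, which is why the class count has to be fixed before entries are assigned.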
2. The integrated circuit of claim 1, wherein the directory control logic further configures a number of ways per congruence class utilized in the real address-based directory based on the configuration parameters.
This claim narrows claim 1. In addition to the number of congruence classes, the directory control logic also sets the number of ways (entries) per congruence class in the real address-based directory, again based on the configuration parameters. Both dimensions of the set-associative directory, the number of sets and the associativity of each set, are therefore configurable. The claim does not recite workload monitoring or performance metrics; the only stated input is the configuration parameters provided by or on behalf of the accelerator unit.
3. The integrated circuit of claim 2, wherein: the directory control logic is configured to first attempt to configure a desired number of ways in the real address-based directory and then to attempt to configure a desired number of congruence classes.
This claim narrows claim 2 by fixing the order of configuration. The directory control logic first attempts to configure the desired number of ways in the real address-based directory, and then attempts to configure the desired number of congruence classes. Ways are the entries within one congruence class (the associativity); congruence classes are the sets that a real address can index into. The claim specifies only the ordering: it does not make the second step conditional on the first failing, and it does not state why ways are handled first.
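One way to picture the ways-first ordering is sketched below. It assumes the hardware exposes a maximum way count and a total entry capacity (max_ways and max_entries are hypothetical names), and that the congruence class count is then limited by whatever capacity the chosen ways leave. The claim fixes only the order of the two attempts, not this arithmetic.

```python
from typing import Tuple


def configure_ways_then_classes(
    desired_ways: int,
    desired_classes: int,
    max_ways: int,
    max_entries: int,
) -> Tuple[int, int]:
    """Return (ways, congruence_classes), settling ways before classes."""
    if desired_ways < 1 or desired_classes < 1:
        raise ValueError("desired dimensions must be positive")

    # Step 1: attempt the desired number of ways.
    ways = min(desired_ways, max_ways)

    # Step 2: attempt the desired number of congruence classes,
    # limited (by assumption) to the capacity the ways leave over.
    classes = min(desired_classes, max_entries // ways)
    return ways, classes


print(configure_ways_then_classes(8, 1024, max_ways=8, max_entries=4096))
print(configure_ways_then_classes(4, 256, max_ways=8, max_entries=4096))
```

In the first call the full way count is granted and the class count gives way; reversing the order would instead preserve the class count and reduce associativity.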
4. The integrated circuit of claim 3, wherein the directory control logic, based on the configuration parameters specifying directory dimensions greater than a maximum physical size of the real address-based directory, configures the real address-based directory to use the maximum physical size.
This claim narrows claim 3 by covering over-sized requests. If the configuration parameters specify directory dimensions greater than the maximum physical size of the real address-based directory, the directory control logic configures the directory to use the maximum physical size instead. The request is capped at what the hardware provides rather than being honored as written. The claim does not describe any other parameters or any error reporting.
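The capping behavior can be sketched in a few lines. The sketch assumes the physical limit is expressed as a maximum class count and a maximum way count (max_classes and max_ways are hypothetical names), and that a request within total capacity is still limited per dimension; the claim itself only says that an over-sized request results in the maximum physical size.

```python
from typing import Tuple


def clamp_to_physical(
    desired_classes: int,
    desired_ways: int,
    max_classes: int,
    max_ways: int,
) -> Tuple[int, int]:
    """Return (classes, ways) that never exceed the physical directory."""
    # Request larger than the whole directory: use the maximum physical size.
    if desired_classes * desired_ways > max_classes * max_ways:
        return max_classes, max_ways
    # Otherwise honor the request, still limiting each dimension
    # (an assumption beyond what the claim states).
    return min(desired_classes, max_classes), min(desired_ways, max_ways)


print(clamp_to_physical(2048, 16, max_classes=1024, max_ways=8))
print(clamp_to_physical(256, 4, max_classes=1024, max_ways=8))
```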
5. The integrated circuit of claim 1, wherein the directory control logic is configured to iteratively attempt a best fit of desired directory dimensions indicated by the configuration parameters within a maximum physical size of the real address-based directory.
This claim narrows claim 1. The configuration parameters indicate desired directory dimensions, and the directory control logic iteratively attempts a best fit of those dimensions within the maximum physical size of the real address-based directory. In other words, rather than applying the request in a single step, the logic makes repeated attempts to fit the requested dimensions into the available directory. The claim concerns sizing the directory, not translating virtual addresses, and it does not specify how each attempt is chosen or what "best" is measured against.
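Because the claim leaves the fitting procedure open, the following is only one plausible reading: start from the desired dimensions and repeatedly halve them until the total entry count fits. The halving strategy, the preference for shrinking classes before ways, and the name best_fit are all assumptions for illustration.

```python
from typing import Tuple


def best_fit(desired_classes: int, desired_ways: int, max_entries: int) -> Tuple[int, int]:
    """Iteratively shrink (classes, ways) until classes * ways <= max_entries."""
    classes, ways = desired_classes, desired_ways
    while classes * ways > max_entries:
        if classes > 1:
            classes //= 2   # assumed: give up congruence classes first
        elif ways > 1:
            ways //= 2      # then give up associativity
        else:
            break           # nothing left to shrink
    return classes, ways


print(best_fit(2048, 8, max_entries=4096))
print(best_fit(64, 4, max_entries=4096))
```

Each loop iteration is one "attempt"; a real implementation could just as well try a table of supported geometries in preference order.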
6. The integrated circuit of claim 1, wherein: the configuration parameters include a number of host tags to employ to map between entries in the real address-based directory and data storage locations the effective address-based accelerator cache.
This claim narrows claim 1 by naming one of the configuration parameters: the number of host tags to employ. Host tags map between entries in the real address-based directory, which resides in the integrated circuit, and data storage locations in the effective address-based accelerator cache, which resides in the accelerator unit. Because the directory is organized by real address and the accelerator cache by effective address, the host tag is what links a directory entry to the corresponding cache location. The claim covers only the number of host tags as a parameter; it does not describe the tag format or recite other settings such as replacement policies.
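To make the "map between" wording concrete, the sketch below treats a host tag as a small integer that names one storage location in the accelerator cache and pairs it with one directory entry. That interpretation, the (congruence class, way) entry layout, and the HostTagMap class are assumptions for illustration; the claim states only that the number of host tags is a configuration parameter.

```python
from typing import Dict, Optional, Tuple

# Hypothetical directory entry address: (congruence class, way).
DirectoryEntry = Tuple[int, int]


class HostTagMap:
    """Two-way mapping between host tags and directory entries."""

    def __init__(self, num_host_tags: int) -> None:
        if num_host_tags < 1:
            raise ValueError("need at least one host tag")
        self.num_host_tags = num_host_tags  # the configured parameter
        self._entry_by_tag: Dict[int, DirectoryEntry] = {}
        self._tag_by_entry: Dict[DirectoryEntry, int] = {}

    def bind(self, host_tag: int, entry: DirectoryEntry) -> None:
        if not 0 <= host_tag < self.num_host_tags:
            raise ValueError("host tag out of configured range")
        # Drop stale pairings so both directions stay consistent.
        old_entry = self._entry_by_tag.pop(host_tag, None)
        if old_entry is not None:
            del self._tag_by_entry[old_entry]
        old_tag = self._tag_by_entry.pop(entry, None)
        if old_tag is not None:
            del self._entry_by_tag[old_tag]
        self._entry_by_tag[host_tag] = entry
        self._tag_by_entry[entry] = host_tag

    def entry_for(self, host_tag: int) -> Optional[DirectoryEntry]:
        return self._entry_by_tag.get(host_tag)

    def tag_for(self, entry: DirectoryEntry) -> Optional[int]:
        return self._tag_by_entry.get(entry)


tags = HostTagMap(num_host_tags=8)
tags.bind(3, (5, 1))
print(tags.entry_for(3), tags.tag_for((5, 1)))
```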
7. A system, comprising: the integrated circuit of claim 6; the accelerator unit coupled to the integrated circuit via the second communication interface, wherein: the accelerator cache includes a cache configuration register for storing one of the configuration parameters; and the cache configuration register specifies a value for the configuration parameter expressed as a positive integer exponent of 2.
This claim covers a system that combines the integrated circuit of claim 6 with the accelerator unit, coupled through the second communication interface. The accelerator cache, which is part of the accelerator unit rather than the integrated circuit, includes a cache configuration register that stores one of the configuration parameters. The register specifies the parameter's value expressed as a positive integer exponent of 2. On one reading, the register holds n and the parameter value is 2^n, which restricts the parameter to powers of two. The claim does not say which parameter the register holds, and it does not recite changing the value during operation.
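Under the reading that the register stores an exponent n and the parameter equals 2^n, decoding and encoding reduce to a shift and a bit-length calculation. The function names and the exact validity checks below are assumptions; the claim does not define a register layout.

```python
def decode_config_register(exponent: int) -> int:
    """Turn a stored exponent n (n >= 1) into the parameter value 2**n."""
    if exponent < 1:
        raise ValueError("exponent must be a positive integer")
    return 1 << exponent


def encode_config_register(value: int) -> int:
    """Return n such that value == 2**n, rejecting non-powers of two."""
    if value < 2 or value & (value - 1) != 0:
        raise ValueError("value must be a power of two with exponent >= 1")
    return value.bit_length() - 1


print(decode_config_register(10))
print(encode_config_register(1024))
```

Storing the exponent keeps the register narrow and guarantees that a dimension such as a congruence class count can be indexed with a plain bit field.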
8. A method of data processing in a coherent data processing system including a system memory, the method comprising: host attach logic communicating memory access requests with the coherent data processing system via a first communication interface and communicating, via a second communication interface, memory access requests and request responses with an accelerator unit including an effective address-based accelerator cache for buffering copies of data from the system memory; the host attach logic recording, in a real address-based directory inclusive of contents of the accelerator cache, addresses of data from the system memory accessed by the accelerator unit, wherein the recording includes assigning entries in the real address-based directory based on real addresses utilized to identify storage locations in the system memory; and configuring at least a number of congruence classes utilized in the real address-based directory based on configuration parameters provided by or on behalf of the accelerator unit.
This claim covers a method of data processing in a coherent data processing system that includes a system memory. Host attach logic exchanges memory access requests with the coherent system over a first communication interface, and exchanges memory access requests and request responses with an accelerator unit over a second communication interface. The accelerator unit has an effective address-based cache that holds copies of data from system memory. The host attach logic records, in a real address-based directory that is inclusive of the accelerator cache, the addresses of system memory data accessed by the accelerator unit, assigning directory entries by real address rather than by the effective addresses the accelerator cache uses. The method also configures at least the number of congruence classes in the directory based on configuration parameters provided by, or on behalf of, the accelerator unit.
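The recording step, together with the inclusive property, can be sketched as follows. The sketch assumes a 128-byte line, modulo indexing, and a deliberately naive victim choice (always way 0); it also assumes that when an entry is displaced, the corresponding line must be invalidated in the accelerator cache to keep the directory inclusive. None of these details comes from the claim, and InclusiveDirectory is a hypothetical name.

```python
from typing import List, Optional


class InclusiveDirectory:
    """Toy real address-based directory tracking lines held by an accelerator."""

    def __init__(self, num_classes: int, num_ways: int, line_bytes: int = 128) -> None:
        self.num_classes = num_classes
        self.line_bytes = line_bytes
        # Each congruence class holds num_ways slots; None marks a free slot.
        self.sets: List[List[Optional[int]]] = [
            [None] * num_ways for _ in range(num_classes)
        ]

    def record(self, real_address: int) -> Optional[int]:
        """Record an accessed line.

        Returns the real address of a displaced line that the accelerator
        cache would have to invalidate, or None if nothing was displaced.
        """
        line = real_address // self.line_bytes
        ways = self.sets[line % self.num_classes]
        if line in ways:
            return None  # already tracked
        for i, slot in enumerate(ways):
            if slot is None:
                ways[i] = line
                return None
        victim = ways[0]  # naive placeholder policy, not from the claim
        ways[0] = line
        return victim * self.line_bytes


d = InclusiveDirectory(num_classes=2, num_ways=1)
print(d.record(0x100))
print(d.record(0x300))
```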
9. The method of claim 8, and further comprising configuring a number of ways per congruence class utilized in the real address-based directory based on the configuration parameters.
This claim narrows claim 8 and is the method counterpart of claim 2. The method additionally configures the number of ways per congruence class in the real address-based directory based on the configuration parameters. The claim does not recite monitoring workloads or performance metrics; the only stated input is the configuration parameters provided by or on behalf of the accelerator unit.
10. The method of claim 9, wherein the configuring includes first attempting to configure a desired number of ways in the real address-based directory and then to attempting to configure a desired number of congruence classes.
This claim narrows claim 9 and is the method counterpart of claim 3. The configuring step first attempts to configure the desired number of ways in the real address-based directory and then attempts to configure the desired number of congruence classes. There is a single directory: ways and congruence classes are its two dimensions, not separate structures. The claim fixes only the order of the two attempts, and it concerns directory sizing rather than address translation.
11. The method of claim 8, and further comprising: based on the configuration parameters specifying directory dimensions greater than a maximum physical size of the real address-based directory, configuring the real address-based directory to use the maximum physical size.
This claim narrows claim 8 and is the method counterpart of claim 4. If the configuration parameters specify directory dimensions greater than the maximum physical size of the real address-based directory, the method configures the directory to use the maximum physical size. The limit in question is the physical capacity of the directory itself, not the size of system memory or of any address space, and the directory tracks cached lines by real address rather than translating virtual addresses.
12. The method of claim 8, wherein the configuring includes iteratively attempting a best fit of desired directory dimensions indicated by the configuration parameters within a maximum physical size of the real address-based directory.
This claim narrows claim 8 and is the method counterpart of claim 5. The configuring step iteratively attempts a best fit of the desired directory dimensions, as indicated by the configuration parameters, within the maximum physical size of the real address-based directory. The dimensions named elsewhere in the claims are the number of congruence classes and the number of ways per class. The claim does not specify the fitting procedure.
13. The method of claim 8, wherein: the configuration parameters include a number of host tags to employ to map between entries in the real address-based directory and data storage locations the effective address-based accelerator cache.
This claim narrows claim 8 and is the method counterpart of claim 6. The configuration parameters include the number of host tags to employ. Host tags map between entries in the real address-based directory and data storage locations in the effective address-based accelerator cache. The claim treats the number of host tags as a configuration input; it does not recite changing that number during operation.
14. The method of claim 8, and further comprising specifying a configuration parameter in a cache configuration register, wherein the cache configuration register specifies a value for the configuration parameter expressed as a positive integer exponent of 2.
This claim narrows claim 8 and is the method counterpart of the register limitation in claim 7. The method includes specifying a configuration parameter in a cache configuration register, where the register gives the parameter's value expressed as a positive integer exponent of 2. On one reading, the register holds n and the parameter value is 2^n, so the parameter is always a power of two. The claim does not say which parameter is stored this way.
15. A design structure tangibly embodied in a storage device for designing, manufacturing, or testing an integrated circuit, the design structure comprising: host attach logic, including: a first communication interface for communicatively coupling the integrated circuit with the coherent data processing system; a second communication interface for communicatively coupling the integrated circuit with an accelerator unit including an effective address-based accelerator cache for buffering copies of data from the system memory of the coherent data processing system; a set-associative real address-based directory inclusive of contents of the accelerator cache, wherein the real address-based directory assigns entries based on real addresses utilized to identify storage locations in the system memory; and directory control logic that configures at least a number of congruence classes utilized in the real address-based directory based on configuration parameters provided by or on behalf of the accelerator unit.
This claim covers a design structure, tangibly embodied in a storage device, for designing, manufacturing, or testing an integrated circuit. The design structure describes host attach logic with the same elements as the integrated circuit of claim 1: a first communication interface to the coherent data processing system; a second communication interface to an accelerator unit whose effective address-based accelerator cache holds copies of data from system memory; a set-associative real address-based directory that is inclusive of the accelerator cache and assigns entries by real address; and directory control logic that sets at least the number of congruence classes in the directory based on configuration parameters provided by, or on behalf of, the accelerator unit.
16. The design structure of claim 15, wherein the directory control logic further configures a number of ways per congruence class utilized in the real address-based directory based on the configuration parameters.
This claim narrows claim 15 and is the design structure counterpart of claim 2. The directory control logic also configures the number of ways per congruence class in the real address-based directory based on the configuration parameters. As a general property of set-associative structures, more ways per class reduce conflicts among addresses that map to the same class, at the cost of more storage and comparison hardware; the claim itself does not state this trade-off or how the parameters are chosen.
17. The design structure of claim 16, wherein: the directory control logic is configured to first attempt to configure a desired number of ways in the real address-based directory and then to attempt to configure a desired number of congruence classes.
This claim narrows claim 16 and is the design structure counterpart of claim 3. The directory control logic first attempts to configure the desired number of ways in the real address-based directory and then attempts to configure the desired number of congruence classes. The claim specifies the order only; it gives no rationale for handling ways first.
18. The design structure of claim 17, wherein the directory control logic, based on the configuration parameters specifying directory dimensions greater than a maximum physical size of the real address-based directory, configures the real address-based directory to use the maximum physical size.
This claim narrows claim 17 and is the design structure counterpart of claim 4. If the configuration parameters specify directory dimensions greater than the maximum physical size of the real address-based directory, the directory control logic configures the directory to use the maximum physical size. The claim recites no additional components beyond those of the claims it depends on.
19. The design structure of claim 15, wherein the directory control logic is configured to iteratively attempt a best fit of desired directory dimensions indicated by the configuration parameters within a maximum physical size of the real address-based directory.
This claim narrows claim 15 and is the design structure counterpart of claim 5. The directory control logic iteratively attempts a best fit of the desired directory dimensions indicated by the configuration parameters within the maximum physical size of the real address-based directory. The directory is organized by real address, that is, by the addresses that identify storage locations in system memory. The claim does not name the dimensions being fitted or the iteration strategy.
20. The design structure of claim 15, wherein: the configuration parameters include a number of host tags to employ to map between entries in the real address-based directory and data storage locations the effective address-based accelerator cache.
This claim narrows claim 15 and is the design structure counterpart of claim 6. The configuration parameters include the number of host tags employed to map between entries in the real address-based directory and data storage locations in the effective address-based accelerator cache. The claim does not name particular accelerator types or describe the tag format.
September 1, 2020