Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A computer system comprising a host, the computer system comprising: a memory comprising a return target cache implemented as an array having a size that is dynamically adjustable; and one or more processors coupled to the memory, the one or more processors being configured to: execute instructions in an output language (OL); run a guest, the guest configured to issue a sequence of instructions in an input language (IL); dynamically adjust the size of the return target cache; and run a binary translator configured to convert the sequence of IL instructions of the guest into a corresponding sequence of OL instructions of the host, wherein the binary translator converts a first sequence of IL instructions of the guest including a call statement that directs execution to a subroutine having an IL procedure entry address P and an IL return address R, into a first sequence of OL instructions of the host by: (i) translating the call statement into a call block of instructions, the call block pushing the IL return address R onto a stack and storing a known OL return address corresponding to the IL return address R in the return target cache at a location corresponding to a value derived from the IL procedure entry address P, and (ii) inserting a confirm block of instructions at an address following the call block, the confirm block determining a hit if an address popped from the stack matches the IL return address R, wherein the one or more processors, responsive to the confirm block determining the hit during execution of the first sequence of OL instructions, retrieve the known OL return address from the return target cache at the location corresponding to the value derived from the IL procedure entry address P and continue execution of the first sequence of OL instructions at the retrieved known OL return address.
A computer system includes a host with a memory and one or more processors. The memory contains a return target cache implemented as an array whose size is dynamically adjustable. The processors execute instructions in an output language (OL) and run a guest that issues instructions in an input language (IL). The system dynamically adjusts the size of the return target cache and runs a binary translator that converts the guest's IL instructions into OL instructions for the host. When the binary translator encounters a call statement in the guest's IL instructions, it converts it into a sequence of OL instructions. The call statement directs execution to a subroutine with an IL procedure entry address (P) and an IL return address (R). The translator generates a call block that pushes the IL return address (R) onto a stack and stores a known OL return address (corresponding to R) in the return target cache at a location derived from the IL procedure entry address (P). A confirm block is inserted after the call block to check whether the address popped from the stack matches the IL return address (R). If it matches (a hit), the system retrieves the known OL return address from the cache at the location derived from P and continues execution at that address. The apparent benefit is that a return whose address matches can resume at the cached OL address without a fresh IL-to-OL address lookup.
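The call block and confirm block described above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the class name `ReturnTargetCache`, the functions `call_block` and `confirm_block`, and the choice of "P modulo array size" as the value derived from P are all assumptions made for the example.

```python
class ReturnTargetCache:
    """Array-backed cache of known OL return addresses (hypothetical sketch)."""

    def __init__(self, size=64):
        self.slots = [None] * size

    def _index(self, il_entry_p):
        # Assumption: the "value derived from P" is P modulo the array size.
        return il_entry_p % len(self.slots)

    def store(self, il_entry_p, ol_return):
        self.slots[self._index(il_entry_p)] = ol_return

    def load(self, il_entry_p):
        return self.slots[self._index(il_entry_p)]


def call_block(rtc, stack, il_entry_p, il_return_r, ol_return):
    """Push the IL return address R and cache the known OL return address."""
    stack.append(il_return_r)
    rtc.store(il_entry_p, ol_return)


def confirm_block(rtc, stack, il_entry_p, il_return_r):
    """Pop the return address; on a hit, return the cached OL address."""
    popped = stack.pop()
    if popped == il_return_r:
        return rtc.load(il_entry_p)  # hit: continue at the known OL address
    return None                      # miss: handled elsewhere (see claim 3)


rtc = ReturnTargetCache(size=8)
stack = []
call_block(rtc, stack, il_entry_p=0x4010, il_return_r=0x2004, ol_return=0x9000)
resume_at = confirm_block(rtc, stack, il_entry_p=0x4010, il_return_r=0x2004)
```

In this sketch a hit yields the OL address stored by the call block, and a `None` result stands in for the miss case that the later claims handle.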
2. The computer system of claim 1 , wherein the binary translator is configured to convert each of one or more other N sequences of IL instructions of the guest including a call statement that directs execution to a subroutine having an IL procedure entry address Pn and an IL return address Rn (where n=1, 2, . . . , N), into a corresponding sequence of OL instructions of the host.
Claim 2 extends the system of claim 1 to more than one call site. In addition to the first translated sequence, the binary translator converts each of N other sequences of input language (IL) instructions from the guest into a corresponding sequence of output language (OL) instructions for the host. Each of these sequences contains a call statement that directs execution to a subroutine with its own IL procedure entry address (Pn) and IL return address (Rn), where n = 1, 2, ..., N indexes the sequences. The claim does not restate the translation steps. Its practical effect is to introduce the additional return addresses Rn that the confirm block of claim 3 is checked against.
3. The computer system of claim 2 , wherein the confirm block determines a miss if the address popped from the stack does not match the IL return address R or any of the other IL return addresses Rn (where n=1, 2, . . . , N), and responsive to the confirm block determining the miss, executes a miss/failure handler to recover a correct OL return address.
Claim 3 adds miss handling to the system of claim 2. The confirm block compares the address popped from the stack with the input language (IL) return address R and with the other IL return addresses Rn (n = 1, 2, ..., N). If the popped address matches none of them, the confirm block determines a miss. In response, a miss/failure handler runs to recover the correct output language (OL) return address, so that the translated code can continue at the right place even when the return target cache did not supply a usable address. The claim does not specify how the handler recovers that address.
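The hit/miss decision and the fallback to a handler can be sketched as follows. This is an illustration under stated assumptions: the claim does not say how the miss/failure handler recovers the OL return address, so the `MissHandler` class below simply looks it up in a complete IL-to-OL table, and the names `MissHandler`, `confirm_or_recover`, and `known_returns` are hypothetical.

```python
class MissHandler:
    """Slow-path recovery of an OL return address (hypothetical sketch)."""

    def __init__(self, full_il_to_ol):
        # Assumption: the handler can consult a complete IL-to-OL mapping.
        self.full_il_to_ol = full_il_to_ol
        self.runs = 0  # number of times the handler has executed

    def recover(self, il_return):
        self.runs += 1
        return self.full_il_to_ol[il_return]


def confirm_or_recover(popped_il_return, known_returns, handler):
    """Return the OL address to resume at.

    known_returns maps the IL return addresses R and Rn to their known
    OL return addresses. A popped address found there is a hit; any
    other address is a miss and goes to the handler.
    """
    if popped_il_return in known_returns:
        return known_returns[popped_il_return]
    return handler.recover(popped_il_return)


known_returns = {0x2004: 0x9000, 0x2100: 0x9100}
handler = MissHandler({0x2004: 0x9000, 0x2100: 0x9100, 0x3000: 0xA000})

hit_target = confirm_or_recover(0x2100, known_returns, handler)
miss_target = confirm_or_recover(0x3000, known_returns, handler)
```

The `runs` counter is included because the later claims adjust the cache size from the number of handler executions.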
4. The computer system of claim 3 , wherein the one or more processors are configured to track a number of times the miss/failure handler executes during execution of OL instructions converted from IL instructions of the guest, and adjust the size of the return target cache based on said number of times.
Claim 4 ties the cache size to how often misses occur. While the host executes output language (OL) instructions converted from the guest's input language (IL) instructions, the processors track the number of times the miss/failure handler of claim 3 executes, and they adjust the size of the return target cache based on that number. The claim does not state the direction of the adjustment. A natural reading is that frequent handler executions indicate a cache that is too small, so the cache is enlarged, while a low count could allow it to shrink and save memory.
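One way to picture count-based resizing is an array that doubles once the handler has run a set number of times. The sketch below is an assumption-laden illustration, not the claimed mechanism: the threshold, the doubling policy, the size cap, and the decision to discard old entries on resize are all choices made for the example, and the class name `ResizableReturnTargetCache` is hypothetical.

```python
class ResizableReturnTargetCache:
    """Return target cache that grows after repeated handler runs (sketch)."""

    def __init__(self, size=4, max_size=64, grow_threshold=3):
        self.slots = [None] * size
        self.max_size = max_size
        self.grow_threshold = grow_threshold
        self.handler_runs = 0

    def record_handler_run(self):
        """Call once each time the miss/failure handler executes."""
        self.handler_runs += 1
        if (self.handler_runs >= self.grow_threshold
                and len(self.slots) < self.max_size):
            # Assumption: old entries are dropped because slot indices
            # change with the size; an empty slot only costs a miss.
            self.slots = [None] * (len(self.slots) * 2)
            self.handler_runs = 0


cache = ResizableReturnTargetCache(size=4, grow_threshold=3)
cache.record_handler_run()
cache.record_handler_run()
size_before = len(cache.slots)
cache.record_handler_run()  # third run reaches the threshold
size_after = len(cache.slots)
```

Dropping entries on resize is only reasonable if every miss is recoverable by the handler, which is the role claim 3 gives it.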
5. The computer system of claim 4 , wherein the one or more processors are configured to adjust the size of the return target cache based on the number of times the miss/failure handler executes during the execution of the OL instructions converted from IL instructions of the quest over a predetermined period of time.
Claim 5 narrows claim 4 by adding a time window. The size of the return target cache is adjusted based on the number of times the miss/failure handler executes over a predetermined period of time, during execution of output language (OL) instructions converted from the guest's input language (IL) instructions. (The claim text reads "quest", which appears to be a misprint of "guest".) Counting over a fixed period turns the raw count into a rate, which presumably lets the adjustment follow recent behavior rather than the total since startup. As in claim 4, the claim does not specify thresholds or whether the cache grows or shrinks.
6. The computer system of claim 4 , wherein the one or more processors are configured to adjust the size of the return target cache based on the number of times the miss/failure handler executes during the execution of the OL instructions converted from IL instructions of the quest over a predetermined period of time, divided by a total number of subroutine calls made during the execution of the OL instructions converted from IL instructions of the quest.
Claim 6 narrows claim 4 by normalizing the miss count. The size of the return target cache is adjusted based on the number of times the miss/failure handler executes over a predetermined period of time, divided by the total number of subroutine calls made during execution of the output language (OL) instructions converted from the guest's input language (IL) instructions. The result is a miss ratio rather than a raw count, so a workload is not treated as missing heavily merely because it makes many calls. The claim does not give threshold values or say how the size changes in response to the ratio.
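The ratio in this claim is the handler count divided by the total number of subroutine calls. The sketch below shows one way a sizing decision could be driven by it. The thresholds (10% and 1%), the doubling and halving policy, the size bounds, and the function names are illustrative assumptions; none of them come from the claim.

```python
def miss_ratio(handler_runs, total_calls):
    """Handler executions in the period divided by total subroutine calls."""
    return handler_runs / total_calls


def next_cache_size(current_size, handler_runs, total_calls,
                    grow_above=0.10, shrink_below=0.01,
                    min_size=16, max_size=4096):
    """Pick the cache size for the next period (hypothetical policy)."""
    if total_calls == 0:
        return current_size  # no calls in the period: nothing to learn
    ratio = miss_ratio(handler_runs, total_calls)
    if ratio > grow_above:
        return min(current_size * 2, max_size)
    if ratio < shrink_below:
        return max(current_size // 2, min_size)
    return current_size


grown = next_cache_size(64, handler_runs=20, total_calls=100)   # ratio 0.20
steady = next_cache_size(64, handler_runs=5, total_calls=100)   # ratio 0.05
shrunk = next_cache_size(64, handler_runs=0, total_calls=100)   # ratio 0.00
```

Using two thresholds with a gap between them is a common way to keep a size from oscillating when the ratio hovers near a single cutoff.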
7. A method for implementing subroutine calls and returns in a computer system having a host configured to execute instructions in an output language (OL), a guest communicatively coupled to the host and configured to issue a sequence of instructions in an input language (IL), a return target cache implemented as an array having a size that is dynamically adjusted, and a binary translator configured to convert the sequence of IL instructions of the guest into a corresponding sequence of OL instructions of the host, said method comprising: converting a first sequence of IL instructions of the guest including a call statement that directs execution to a subroutine having an IL procedure entry address P and an IL return address R, into a first sequence of OL instructions of the host by: (i) translating the call statement into a call block of instructions, the call block pushing the IL return address R onto a stack and storing a known OL return address corresponding to the IL return address R in the return target cache at a location corresponding to a value derived from the IL procedure entry address P, and (ii) inserting a confirm block of instructions at an address following the call block, the confirm block determining a hit if an address popped from the stack matches the IL return address R; and responsive to the confirm block determining the hit during execution of the first sequence of OL instructions, retrieving the known OL return address from the return target cache at the location corresponding to the value derived from the IL procedure entry address P and continuing execution of the first sequence of OL instructions at the retrieved known OL return address.
This technical summary describes a method for optimizing subroutine calls and returns in a computer system where a guest system executes instructions in an input language (IL) and a host system executes instructions in an output language (OL). The method addresses inefficiencies in translating and executing subroutine calls between different instruction sets, particularly when the guest and host use different languages. The system includes a host, a guest, a return target cache, and a binary translator. The binary translator converts IL instructions from the guest into OL instructions for the host. When a subroutine call is encountered, the method translates the call into a call block that pushes the IL return address onto a stack and stores a corresponding OL return address in the return target cache, indexed by a value derived from the subroutine's entry address. A confirm block is inserted to verify that the address popped from the stack matches the expected IL return address. If a match (hit) is detected, the method retrieves the pre-stored OL return address from the cache and continues execution at that address, improving performance by avoiding redundant address calculations. The return target cache dynamically adjusts its size to optimize memory usage and access speed. This approach reduces overhead in cross-language subroutine calls by leveraging cached return addresses.
8. The method of claim 7 , further comprising: converting each of one or more other N sequences of IL instructions of the guest including a call statement that directs execution to a subroutine having an IL procedure entry address Pn and an IL return address Rn (where n=1, 2, . . . , N), into a corresponding sequence of OL instructions of the host.
Claim 8 extends the method of claim 7 to additional call sites. Besides the first sequence, the method converts each of N other sequences of input language (IL) instructions from the guest into a corresponding sequence of output language (OL) instructions for the host. Each such sequence includes a call statement that directs execution to a subroutine with an IL procedure entry address (Pn) and an IL return address (Rn), where n = 1, 2, ..., N. The claim recites only this conversion step and adds no further limits on how it is performed.
9. The method of claim 8 , wherein the confirm block determines a miss if the address popped from the stack does not match the IL return address R or any of the other IL return addresses Rn (where n=1, 2, . . . , N), and responsive to the confirm block determining the miss, executes a miss/failure handler to recover a correct OL return address.
Claim 9 adds miss handling to the method of claim 8. The confirm block compares the address popped from the stack against the input language (IL) return address R and the other IL return addresses Rn (n = 1, 2, ..., N). If the popped address matches none of them, the confirm block determines a miss. In response, a miss/failure handler is executed to recover the correct output language (OL) return address, so that execution of the translated code continues at the right location. The claim does not describe the handler's internals.
10. The method of claim 9 , further comprising: tracking a number of times the host executes the miss/failure handler during execution of OL instructions converted from IL instructions of the guest; and adjusting the size of the return target cache based on said number of times.
Claim 10 adds two steps to the method of claim 9. First, the method tracks the number of times the host executes the miss/failure handler while running output language (OL) instructions converted from the guest's input language (IL) instructions. Second, it adjusts the size of the return target cache based on that number. The claim does not say when the cache should grow or shrink. One plausible policy is to enlarge the cache when the handler runs often and to reduce it when the handler rarely runs, trading memory against the cost of misses.
11. The method of claim 10 , wherein the size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler during the execution of the OL instructions converted from IL instructions of the quest over a predetermined period of time.
Claim 11 narrows claim 10 by measuring over a time window. The size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler over a predetermined period of time, during execution of output language (OL) instructions converted from the guest's input language (IL) instructions. The count over a fixed period acts as a miss rate. The claim does not specify the length of the period, any thresholds, or the direction of the adjustment.
12. The method of claim 10 , wherein the size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler during the execution of the OL instructions converted from IL instructions of the quest over a predetermined period at time, divided by a total number of subroutine calls made during the execution of the OL instructions converted from IL instructions of the quest.
Claim 12 narrows claim 10 by using a ratio instead of a raw count. The size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler over a predetermined period of time, divided by the total number of subroutine calls made during execution of the output language (OL) instructions converted from the guest's input language (IL) instructions. (The claim text reads "quest", which appears to be a misprint of "guest".) This ratio expresses what fraction of calls end in the handler, so it can be compared across workloads with different call volumes. The claim does not state thresholds or how the cache size responds to the ratio.
13. A non-transitory computer readable medium embodying program instructions for implementing subroutine calls and returns in a computer system having a host configured to execute instructions in an output language (OL), a guest communicatively coupled to the host and configured to issue a sequence of instructions in an input language (IL), a return target cache implemented as an array having a size that is dynamically adjusted, and a binary translator configured to convert the sequence of IL instructions of the guest into a corresponding sequence of OL instructions of the host, the program instructions causing the computer system to perform a method comprising the steps of: converting a first sequence of IL instructions of the guest including a call statement that directs execution to a subroutine having an IL procedure entry address P and an IL return address R, into a first sequence of OL instructions of the host by: (i) translating the call statement into a call block of instructions, the call block pushing the IL return address R onto a stack and storing a known OL return address corresponding to the IL return address R in the return target cache at a location corresponding to a value derived from the IL procedure entry address P, and (ii) inserting a confirm block of instructions at an address following the call block, the confirm block determining a hit if an address popped from the stack matches the IL return address R; and responsive to the confirm block determining the hit during execution of the first sequence of OL instructions, retrieving the known OL return address from the return target cache at the location corresponding to the value derived from the IL procedure entry address P and continuing execution of the first sequence of OL instructions at the retrieved known OL return address.
This invention relates to optimizing subroutine calls and returns in a computer system where a guest system executes instructions in an input language (IL) and a host system executes instructions in an output language (OL). The challenge addressed is the inefficiency in handling subroutine calls and returns when translating between different instruction sets, particularly in virtualized or emulated environments. The system includes a host executing OL instructions, a guest issuing IL instructions, a binary translator converting IL instructions to OL instructions, and a dynamically adjustable return target cache implemented as an array. The binary translator processes a sequence of IL instructions containing a call statement that directs execution to a subroutine with an IL procedure entry address (P) and an IL return address (R). The call statement is translated into a call block of OL instructions that pushes the IL return address (R) onto a stack and stores a known OL return address corresponding to (R) in the return target cache at a location derived from (P). A confirm block of instructions is inserted after the call block to verify that the address popped from the stack matches (R). If a match (hit) is detected, the known OL return address is retrieved from the cache using the value derived from (P), and execution continues at this address. This approach reduces overhead by avoiding repeated calculations of return addresses and leveraging cached values.
14. The non-transitory computer readable medium of claim 13 , wherein the method further comprises the step of: converting each of one or more other N sequences of IL instructions of the guest including a call statement that directs execution to a subroutine having an IL procedure entry address Pn and an IL return address Rn (where n=1, 2, . . . , N), into a corresponding sequence of OL instructions of the host.
Claim 14 is the computer readable medium counterpart of claims 2 and 8. The stored program instructions cause the method of claim 13 to also convert each of N other sequences of input language (IL) instructions from the guest into a corresponding sequence of output language (OL) instructions for the host. Each of these sequences includes a call statement that directs execution to a subroutine with an IL procedure entry address (Pn) and an IL return address (Rn), where n = 1, 2, ..., N.
15. The non-transitory computer readable medium of claim 14 , wherein the confirm block determines a miss if the address popped from the stack does not match the IL return address R or any of the other IL return addresses Rn (where n=1, 2, . . . , N), and responsive to the confirm block determining the miss, executes a miss/failure handler to recover a correct OL return address.
Claim 15 is the computer readable medium counterpart of claims 3 and 9. The confirm block determines a miss if the address popped from the stack matches neither the input language (IL) return address R nor any of the other IL return addresses Rn (n = 1, 2, ..., N). When a miss is determined, a miss/failure handler is executed to recover the correct output language (OL) return address, so that the translated code resumes at the proper location. The claim recites only the confirm block and the handler; it does not describe how the handler finds the address.
16. The non-transitory computer readable medium of claim 15 , wherein the method further comprises the steps of: tracking a number of times the host executes the miss/failure handler during execution of OL instructions converted from IL instructions of the guest; and adjusting the size of the return target cache based on said number of times.
Claim 16 is the computer readable medium counterpart of claims 4 and 10. The method further tracks the number of times the host executes the miss/failure handler while running output language (OL) instructions converted from the guest's input language (IL) instructions, and adjusts the size of the return target cache based on that number. The handler here is the one triggered by a confirm block miss in claim 15, not a general exception handler. The claim leaves the adjustment policy open; enlarging the cache when the handler runs often and shrinking it when the handler rarely runs is one plausible reading.
17. The non-transitory computer readable medium of claim 16 , wherein the size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler during the execution of the OL instructions converted from IL instructions of the quest over a predetermined period of time.
Claim 17 is the computer readable medium counterpart of claims 5 and 11. The size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler over a predetermined period of time, during execution of output language (OL) instructions converted from the guest's input language (IL) instructions. Measuring over a set period gives a miss rate that can drive the adjustment. The claim does not specify the period, thresholds, or whether the cache is enlarged or reduced.
18. The non-transitory computer readable medium of claim 16 , wherein the size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler during the execution of the OL instructions converted from IL instructions of the quest over a predetermined period of time, divided by a total number of subroutine calls made during the execution of the OL instructions converted from IL instructions of the quest.
Claim 18 is the computer readable medium counterpart of claims 6 and 12. The size of the return target cache is adjusted based on the number of times the host executes the miss/failure handler over a predetermined period of time, divided by the total number of subroutine calls made during execution of the output language (OL) instructions converted from the guest's input language (IL) instructions. The word "quest" in the claim text appears to be a misprint of "guest" and does not denote a separate kind of program. The resulting ratio indicates what share of subroutine calls end in the handler. The claim does not define thresholds, and it does not recite preloading or eviction of cache entries.
January 28, 2020