A method of operating an application processor including a central processing unit (CPU) with at least one core and a memory interface includes measuring, during a first period, a core active cycle of a period in which the at least one core performs an operation to execute instructions and a core idle cycle of a period in which the at least one core is in an idle state, generating information about a memory access stall cycle of a period in which the at least one core accesses the memory interface in the core active cycle, correcting the core active cycle using the information about the memory access stall cycle to calculate a load on the at least one core using the corrected core active cycle, and performing a DVFS operation on the at least one core using the calculated load on the at least one core.
Legal claims defining the scope of protection, as filed with the USPTO.
1. A method of operating an application processor comprising a central processing unit (CPU) with at least one core and a memory interface, the method comprising: measuring, during a first period, a core active cycle of a period in which the at least one core performs an operation to execute instructions and a core idle cycle of a period in which the at least one core is in an idle state; generating information about a memory access stall cycle of a period in which the at least one core accesses the memory interface in the core active cycle; correcting the core active cycle using the information about the memory access stall cycle to calculate a load on the at least one core using the corrected core active cycle; and performing a dynamic voltage and frequency scaling (DVFS) operation on the at least one core using the calculated load on the at least one core, wherein generating the information about the memory access stall cycle comprises: generating a cycle per instruction (CPI) indicating a cycle required to execute one instruction during the core active cycle; comparing the CPI to a threshold CPI to generate the information about the memory access stall cycle; and subtracting the threshold CPI from the CPI to generate a stall cycle per instruction (SPI), when the CPI exceeds the threshold CPI, and wherein the SPI indicates a cycle required to access the memory interface by the one instruction during the core active cycle.
2. The method of claim 1, wherein the information about the memory access stall cycle comprises the memory access stall cycle, and calculating the load on the at least one core comprises subtracting the memory access stall cycle from the core active cycle to correct the core active cycle.
3. The method of claim 1, wherein performing the DVFS operation comprises calculating the load on the at least one core by using the corrected core active cycle and the core idle cycle.
4. The method of claim 1, wherein generating the CPI comprises: counting a number of instructions executed during the core active cycle; and generating the CPI by using the core active cycle and the counted number of instructions.
5. The method of claim 1, wherein the threshold CPI is set by using at least one of a plurality of candidate active cycles generated when the at least one core executes a plurality of instructions that do not need an access operation with respect to the memory interface and performs a loop measuring the at least one of the plurality of candidate active cycles required to execute the plurality of instructions a plurality of times.
6. The method of claim 5, wherein the threshold CPI is a cycle that is required to execute one instruction during a selected candidate active cycle, which has a longest length among the plurality of candidate active cycles.
7. The method of claim 1, wherein performing the DVFS operation comprises correcting the core active cycle using the CPI and the SPI.
8. The method of claim 6, wherein the information about the memory access stall cycle is not generated when the CPI is less than or equal to the threshold CPI.
9. The method of claim 8, wherein the load on the at least one core is calculated by using the core active cycle and the core idle cycle without correcting the core active cycle when the CPI is less than or equal to the threshold CPI.
10. The method of claim 1, wherein the memory interface is connected to a memory device, the memory interface and the memory device are included in the same memory clock domain, and the method further comprises: measuring, during a second period, a memory active cycle comprising a data transaction cycle of a period in which a data input/output operation is performed using the memory device and a ready operation cycle of a period in which an operation required to perform the data input/output operation is performed; calculating a load on the memory clock domain using the memory active cycle; and performing the DVFS operation on the memory interface using the calculated load on the memory clock domain.
11. The method of claim 10, wherein the at least one core and the memory interface are operated using clock signals having different frequencies from one another and voltages having different levels from one another.
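The core-side calculation recited in claims 1 to 9 can be sketched as follows. This is an illustrative reading, not the patented implementation. The function names, the sample counter values, the stall estimate (SPI multiplied by the instruction count), and the load formula (corrected active cycles divided by the active plus idle cycles of the period) are all assumptions. The claims state only that the CPI is compared with a threshold CPI, that the SPI is the excess of the CPI over the threshold, and that the stall cycle is subtracted from the core active cycle.

```python
def threshold_cpi(candidate_active_cycles, instructions_per_run):
    """Claims 5 and 6: take the longest candidate active cycle measured over
    repeated runs of a loop that needs no memory-interface access, and
    express it as cycles per instruction."""
    return max(candidate_active_cycles) / instructions_per_run


def core_load(active_cycles, idle_cycles, instructions, thr_cpi):
    """Return (load, stall_cycles) for one measurement period."""
    cpi = active_cycles / instructions            # claim 4
    if cpi <= thr_cpi:
        stall_cycles = 0.0                        # claims 8 and 9: no correction
    else:
        spi = cpi - thr_cpi                       # claim 1: SPI = CPI - threshold
        stall_cycles = spi * instructions         # assumption: total stall estimate
    corrected_active = active_cycles - stall_cycles   # claim 2
    # Assumption: load is measured against the whole period (active + idle).
    load = corrected_active / (active_cycles + idle_cycles)
    return load, stall_cycles


# Hypothetical counter readings for one period.
thr = threshold_cpi([1000, 1100, 1200], instructions_per_run=1000)
load, stall = core_load(active_cycles=800_000, idle_cycles=200_000,
                        instructions=400_000, thr_cpi=thr)
print(f"threshold CPI={thr:.2f} stall cycles={stall:.0f} load={load:.2f}")
```

With these made-up readings, the uncorrected load would be 0.80, while the corrected load is 0.48, so a DVFS policy driven by the corrected value would see less demand for a higher core frequency during a memory-bound period.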
12. A method of operating a computing system comprising a plurality of master intellectual properties (IPs), a memory device, and a memory interface, the method comprising: measuring, during a predetermined period, a memory active cycle comprising a data transaction cycle of a period in which the memory interface performs a data input/output operation using the memory device in response to a request from at least one of the master IPs and a ready operation cycle of a period in which an operation required to perform the data input/output operation is performed; calculating a load on a memory clock domain comprising the memory device and the memory interface using the memory active cycle; and performing a dynamic voltage and frequency scaling (DVFS) operation on the memory interface and the memory device using the load on the memory clock domain, wherein measuring the memory active cycle comprises measuring a cycle from a first time point when the request from the at least one of the master IPs reaches the memory clock domain to a second time point when the data input/output operation is completed.
13. The method of claim 12, wherein the load on the memory clock domain is calculated by using a ratio between the memory active cycle and a length of the predetermined period.
14. The method of claim 12, wherein the memory device comprises one of a dynamic random access memory, a flash memory, a phase-change random access memory, a magnetoresistive random access memory, a resistive random access memory, or a ferroelectric random access memory.
15. The method of claim 12, wherein, when the memory device is a dynamic random access memory, the ready operation cycle is a cycle for an operation of amplifying data by using a sense amplifier included in the memory device to input and output the data and an operation of precharging memory cells included in the memory device.
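The memory clock domain load of claims 12 and 13 can be illustrated with a short sketch. It assumes each request is logged as a hypothetical pair of cycle counts (arrival in the memory clock domain, completion of the data input/output operation), so that each interval already covers both the ready operation cycle and the data transaction cycle. Merging overlapping requests so that shared cycles are counted once is also an assumption; the claims do not say how concurrent requests are handled.

```python
def memory_active_cycles(requests):
    """Sum the cycles covered by (arrival_cycle, completion_cycle) intervals,
    counting cycles shared by overlapping requests only once."""
    active = 0
    covered_until = None
    for start, end in sorted(requests):
        if covered_until is None or start > covered_until:
            active += end - start            # new, non-overlapping interval
            covered_until = end
        elif end > covered_until:
            active += end - covered_until    # only the part not yet counted
            covered_until = end
    return active


def memory_domain_load(requests, period_cycles):
    """Claim 13: ratio between the memory active cycle and the period length."""
    return memory_active_cycles(requests) / period_cycles


# Hypothetical request log for a 200-cycle measurement period.
log = [(0, 40), (30, 70), (100, 130)]
print(memory_domain_load(log, period_cycles=200))
```

Measuring from arrival to completion means the ready operation time (for a DRAM, sensing and precharging) counts toward the load even though no data moves on the bus during those cycles.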
16. An application processor comprising: a memory interface connected to at least one external memory device; an input/output interface connected to at least one external master intellectual property (IP); a multi-core CPU including a plurality of cores; and a memory configured to store a dynamic voltage and frequency scaling (DVFS) program, wherein each of the plurality of cores is configured to correct a core active cycle of a period in which an operation is performed to execute instructions during a first period by using information about a memory access stall cycle of a period in which each core accesses the memory interface within the core active cycle and to execute a program stored in the memory to perform a DVFS operation using the corrected core active cycle, wherein the plurality of cores includes a first plurality of cores in a first cluster and a second plurality of cores in a second cluster, wherein the first cluster and the second cluster use a first threshold cycle per instruction (CPI) and a second threshold CPI, respectively, to perform the DVFS operation, and wherein the first threshold CPI and the second threshold CPI are set depending on the performance of the first and second plurality of cores, respectively.
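The per-cluster thresholds of claim 16 can be pictured as a lookup keyed by cluster. The cluster names, the threshold values, and the function name below are hypothetical; the claim says only that each cluster uses its own threshold CPI, set depending on the performance of its cores.

```python
# Hypothetical threshold CPIs; a higher-performance cluster is assumed to
# retire instructions in fewer cycles and therefore gets a lower threshold.
CLUSTER_THRESHOLD_CPI = {"first_cluster": 0.8, "second_cluster": 1.5}


def corrected_active_cycles(cluster, active_cycles, instructions):
    """Correct a core's active cycles using its cluster's threshold CPI."""
    thr = CLUSTER_THRESHOLD_CPI[cluster]
    cpi = active_cycles / instructions
    spi = max(0.0, cpi - thr)                # zero when CPI is within threshold
    return active_cycles - spi * instructions


# The same counter readings yield different corrections per cluster.
for name in CLUSTER_THRESHOLD_CPI:
    print(name, corrected_active_cycles(name, active_cycles=1000, instructions=500))
```

Keeping the threshold per cluster avoids treating a slower core's naturally higher CPI as memory stall time.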
October 30, 2017
August 18, 2020