Before power management behaviours can be discussed, it is important to understand the fundamental limitations of silicon integrated circuits. The primary purpose of power management in such devices is to ensure these limitations are not exceeded, so that the reliability and functionality of the devices are maintained. Many factors affect silicon-based transistor performance, but the focus here is on the most significant factors affecting x86 processors in their typical operating ranges.
Processor frequency is possibly the most obvious performance-limiting factor. Frequency defines how fast the logic of the device is clocked, and therefore how fast instructions are executed. Two processors running at the same frequency but built on different architectures will not deliver the same performance, but it is generally true that increasing frequency increases execution performance.
Faster switching of the transistors requires increasing voltage to overcome the resistive and capacitive elements of the transistor. However, higher voltage increases ageing effects, putting practical limits on voltage application to ensure product longevity. Faster switching of transistors also generates higher currents as those capacitive elements are charged and discharged.
The combination of Ohm’s and Joule’s laws teaches us that all this voltage and current generates power, and that both parameters have a direct relationship with power. In reality, most processor frequency limitations also boil down to power or current limits. Faster switching of transistors increases current and may also require a higher voltage, and either change increases power. Power limits are often the most significant performance-limiting factor, and as a result modern processors based on the x86 architecture tend to be power-limited rather than frequency-limited under heavy workloads.
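The relationship between voltage, frequency and power can be illustrated with the commonly used first-order approximation for CMOS switching power, P ≈ activity × C × V² × f. This formula is not taken from the text above; it is a standard simplification, and the capacitance and operating points below are hypothetical values chosen only to show the trend.

```python
def dynamic_power(c_eff_farads, voltage, freq_hz, activity=1.0):
    """First-order CMOS switching power: P = activity * C * V^2 * f."""
    return activity * c_eff_farads * voltage ** 2 * freq_hz


# Hypothetical operating points (assumed values, not measured data).
base = dynamic_power(c_eff_farads=1e-9, voltage=1.0, freq_hz=3.0e9)
boost = dynamic_power(c_eff_farads=1e-9, voltage=1.2, freq_hz=3.6e9)

print(f"base:  {base:.2f} W")
print(f"boost: {boost:.2f} W ({boost / base:.2f}x the base power)")
```

In this sketch a 20% increase in both frequency and voltage raises switching power by roughly 73%, which suggests why power, rather than frequency alone, tends to become the binding limit.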
As the processor operates, consumed power is converted to heat. Manufacturers will set maximum die temperatures for their products that must be followed. Maintaining this temperature limit is an important task for the power management entity in the processor.
Figure 1: Leakage power distribution for an undisclosed AMD product based on a 14nm FinFET process.
Another basic principle of silicon transistors is that they leak current across junctions and to the substrate. The leakage current in a processor built on a particular process varies mainly with applied voltage and temperature, and it can become quite significant in today’s high-performance processors. All this leakage current dissipates additional power that must be counted as part of the device’s total power consumption. Leakage power therefore reduces the share of the device’s total power envelope that can be consumed as active power.
Workload power density
Understanding power management behaviour in complex microprocessors also requires understanding the concept of workload power density. This concept essentially means that different workloads will generate different amounts of power consumption in the processor, even at the same utilisation level. As an example, one can imagine that a complex floating-point calculation will trigger more transistor activity in the CPU than a simple data movement operation. The potential difference in power between workloads becomes even larger when considering that nearly all x86 microprocessors sold today are multi-core, and most have integrated many other functions that were previously external. Integration of the graphics processing unit (GPU) is the most significant, as it is a very large processing core on its own.
The data in Figure 2 show that the power consumption of a less power-dense workload was only 57% of that of Prime 95 with a single CPU core active. When extrapolated across multiple physical cores, it is easy to see that power variation by workload can grow quite large.
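The extrapolation across cores can be sketched with a few lines of arithmetic. The 57% ratio comes from the text; the per-core wattage and the assumption that power scales linearly with the number of active cores are illustrative simplifications only.

```python
REFERENCE_CORE_POWER_W = 10.0  # hypothetical per-core power for the heavy workload
LIGHT_WORKLOAD_RATIO = 0.57    # ratio quoted in the text for one active core


def power_gap(active_cores):
    """Difference in core power between the heavy and the lighter workload,
    assuming (as a simplification) linear scaling with active core count."""
    heavy = REFERENCE_CORE_POWER_W * active_cores
    light = REFERENCE_CORE_POWER_W * LIGHT_WORKLOAD_RATIO * active_cores
    return heavy - light


for cores in (1, 4, 8):
    print(f"{cores} active core(s): gap of {power_gap(cores):.1f} W")
```

Under these assumptions the absolute gap grows in proportion to the number of active cores, even though the ratio between the two workloads stays the same.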
Defining power limits
Definition of the maximum power consumption is a common starting point when defining processor models. Manufacturers choose power levels to address various use-cases with differing power restrictions, and performance (i.e. frequency) is largely derived from that. x86 processors are largely marketed by their Thermal Design Power (TDP), even though it is a specification related to the thermal solution requirement and not the maximum electrical power that the device can consume. Maximum sustainable power levels will be equal to or greater than TDP, depending on the product.
The power management controller of the processor monitors key parameters to ensure the processor specifications for maximum power, current and temperature are not exceeded. If changes in the operating scenario cause any one parameter to approach its limit, the controller must throttle the processor’s performance to compensate.
This throttling usually takes the form of reducing the operating frequency of the core(s) consuming the largest amounts of power (i.e. CPU and GPU), as they have the biggest impact. These adjustments can happen as often as every millisecond, giving a very quick response to changes in the operating environment or even the workload. Since power consumption varies with the workload, one can recognise why achieving the maximum frequency of a core may not always be possible.
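A minimal sketch of such a throttling loop is shown below. It is not any vendor's actual algorithm: the step size, frequency range and the toy power model (power proportional to frequency) are all assumptions made for illustration.

```python
def next_frequency(freq_mhz, power_w, limit_w,
                   step_mhz=100, f_min_mhz=1000, f_max_mhz=4000):
    """One control step: back off when over the limit, otherwise step back up."""
    if power_w > limit_w:
        return max(freq_mhz - step_mhz, f_min_mhz)
    return min(freq_mhz + step_mhz, f_max_mhz)


def simulate(watts_per_mhz, limit_w, steps=50, start_mhz=4000):
    """Run the controller against a toy workload whose power is proportional
    to frequency (an assumed model). Returns the final frequency."""
    freq = start_mhz
    for _ in range(steps):
        power = watts_per_mhz * freq
        freq = next_frequency(freq, power, limit_w)
    return freq


print("power-dense workload settles near:", simulate(0.02, 65.0), "MHz")
print("lighter workload settles near:    ", simulate(0.01, 65.0), "MHz")
```

With these made-up numbers the power-dense workload ends up hovering just around the frequency at which it meets the limit, while the lighter workload stays at the maximum frequency, which is the behaviour described above.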
The natural result of the power-limited model is that performance is maximised for each workload, but frequency is not predictable as the workload changes. System designers can avoid temperature throttling by designing enough headroom into the thermal solution to ensure that the maximum temperature is never reached. After all, the maximum sustained power level is a known quantity, and airflow and ambient temperature limits can be specified for the final system. Yet two samples of the same processor model could differ in their leakage power, causing one unit to reach its power limit at a lower average frequency even when running an identical workload under identical operating conditions.
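The effect of unit-to-unit leakage variation can be sketched as follows. The model assumes that active power grows linearly with frequency at a fixed voltage, which is a deliberate simplification, and all numeric values are hypothetical.

```python
def sustainable_frequency_mhz(power_limit_w, leakage_w,
                              active_w_per_mhz, f_max_mhz):
    """Highest frequency whose active power fits in the budget left over
    after leakage, capped at the part's maximum frequency."""
    budget_w = max(power_limit_w - leakage_w, 0.0)
    return min(budget_w / active_w_per_mhz, f_max_mhz)


# Two hypothetical samples of the same model, same workload, same limit.
low_leakage = sustainable_frequency_mhz(65.0, 8.0, 0.015, 4000.0)
high_leakage = sustainable_frequency_mhz(65.0, 14.0, 0.015, 4000.0)

print(f"low-leakage unit:  {low_leakage:.0f} MHz")
print(f"high-leakage unit: {high_leakage:.0f} MHz")
```

Both units hit the same power limit, but the higher-leakage sample has less budget left for switching activity and so sustains a lower frequency in this toy model.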
Vendors exploit this difference by allowing the lower-leakage units to spend more time at higher frequency, yielding better performance. In addition, different processor units of the same model can have different voltage requirements to achieve a given clock frequency. This variation can be exploited by fusing a unit-specific voltage vs. frequency curve into each part, which lets the power management controller minimise core voltage.
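One way to picture a fused voltage vs. frequency curve is as a small table of points that the controller interpolates at run time. The sketch below assumes linear interpolation and uses invented curve points for two hypothetical units; real parts may store and evaluate their curves quite differently.

```python
from bisect import bisect_left


def min_voltage(curve, freq_mhz):
    """Look up the voltage for a frequency on a per-unit curve.

    `curve` is a list of (freq_mhz, volts) points sorted by frequency.
    Values between points are linearly interpolated (an assumption).
    """
    freqs = [f for f, _ in curve]
    if freq_mhz <= freqs[0]:
        return curve[0][1]
    if freq_mhz > freqs[-1]:
        raise ValueError("frequency is above the highest point on the curve")
    i = bisect_left(freqs, freq_mhz)
    (f0, v0), (f1, v1) = curve[i - 1], curve[i]
    return v0 + (v1 - v0) * (freq_mhz - f0) / (f1 - f0)


# Invented curves for two units of the same model.
unit_a = [(2000, 0.80), (3000, 0.95), (4000, 1.20)]
unit_b = [(2000, 0.85), (3000, 1.02), (4000, 1.30)]

for name, curve in (("unit A", unit_a), ("unit B", unit_b)):
    print(f"{name} needs about {min_voltage(curve, 3500):.3f} V at 3500 MHz")
```

Because each unit carries its own curve, the controller in this sketch never applies more voltage than that particular part needs for the requested frequency.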
In addition, many real-world PC use-cases have been found to be ‘bursty’, where applications often sit idle waiting for user input and then perform some activity before waiting again. This could be a user starting a program or loading a new web page. Some processors take advantage of this situation by defining a maximum power limit that is greater than the sustained power limit. The processor can be allowed to reach this higher power consumption for a short amount of time that is “thermally insignificant”. Increasing the power limit in this way allows for short periods of increased performance benefiting ‘bursty’ workloads.
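The burst behaviour can be sketched with a running average of power: the higher limit is granted only while the average stays below the sustained limit. This is one possible scheme assumed for illustration, not a description of any specific product, and the limits and smoothing factor are hypothetical.

```python
def run(demand_w, sustained_w=15.0, burst_w=25.0, alpha=0.1):
    """Return the power granted at each time step for a list of demands.

    An exponentially weighted moving average stands in for the thermally
    relevant power history (an assumed model)."""
    avg = 0.0
    granted = []
    for demand in demand_w:
        limit = burst_w if avg < sustained_w else sustained_w
        power = min(demand, limit)
        avg += alpha * (power - avg)
        granted.append(power)
    return granted


# Idle for a few steps, then a long heavy request.
trace = run([2] * 5 + [30] * 40)
print("first busy step granted:", trace[5], "W")
print("final busy step granted:", trace[-1], "W")
print("steps spent at burst power:", trace.count(25.0))
```

In this toy trace the workload briefly runs at the burst limit after the idle period and then falls back to the sustained limit once the average power has caught up.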
Figure 2: Power consumption of a less power-dense workload relative to Prime 95 with a single CPU core active.
Until recently, processor power management technology relied on power curves derived from actual power measurements at manufacturing test time with a reference workload. Values were programmed into the processor and combined with run-time data from complex activity monitors in the logic.
A recent change with AMD processors is the use of power telemetry data from the regulators powering the primary voltage rails. Real-time voltage and current data allow the power management unit to be much more accurate.
Doing so enables every variation of the unit that affects power consumption to be factored in along with instantaneous environmental circumstances and exploited for performance gain.
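As a rough illustration of telemetry-based measurement, package power can be computed from per-rail voltage and current samples using P = V × I. The rail names and readings below are hypothetical, and how a real power management unit obtains and filters this data is not covered here.

```python
def package_power(rails):
    """Sum V * I over all monitored rails.

    `rails` maps a rail name to a (volts, amps) sample."""
    return sum(volts * amps for volts, amps in rails.values())


# Hypothetical regulator readings for two primary rails.
sample = {
    "core_rail": (1.10, 40.0),
    "soc_rail": (0.90, 10.0),
}

print(f"measured package power: {package_power(sample):.1f} W")
```

Because the inputs are live measurements, a calculation like this naturally reflects the leakage, voltage and temperature of the individual unit rather than a value characterised once at test time.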