Novel Power Optimization Methods for AI/HPC Chips: Workload-Aware Adaptive Voltage Scaling
Andrea Matteucci, Product Manager - proteanTecs
As datacenter power consumption continues to pose cooling and cost challenges, and battery-driven devices are expected to last longer between charges, the search for advanced power management mechanisms continues. Since dynamic power consumption is proportional to the clock frequency and to the square of the operating voltage, power management methods focus on lowering the clock rate and/or the operating voltage while still meeting system performance requirements. Adaptive Voltage Scaling (AVS) automatically adjusts the supply voltage to the minimum value required for a target task. It uses in-chip structures to track silicon behavior, so that the voltage can be adjusted to compensate for effects such as process variation, aging, and temperature. However, some effects cannot be tracked effectively in a conventional manner. Varying workloads may cause IR drops of different magnitudes in different areas of a chip. Guard bands are therefore set based on characterization of worst-case workload scenarios (assuming these are known at all). More advanced solutions include critical path emulators that closely mirror the behavior of a chip. Because these emulators are not subject to the same workloads as the real logic circuits they emulate, a safety margin must still be included. Due to these limitations of the monitoring structures, known AVS methods are limited in how far they can reduce the voltage, and hence the power consumption. proteanTecs’ AVS Pro monitors the margin to timing failure of millions of real paths, in real time, under real workloads, allowing an optimal voltage reduction under any given workload at any point in time. It also provides an inherent safety net, enabling fast frequency and voltage scaling to maintain error-free functionality when events such as voltage drops occur. In the talk, silicon results from companies already adopting this approach will be presented, showcasing 9%-14% power savings compared to conventional methods.
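The proportionality mentioned in the abstract (dynamic power scales with frequency and with the square of the supply voltage) can be made concrete with a small sketch. The function names below are hypothetical and chosen for illustration only. The model assumes that switched capacitance and activity factor stay constant and it ignores leakage (static) power, so it is a first-order estimate and not a reconstruction of how the silicon results in the talk were obtained.

```python
def dynamic_power_ratio(v_scale: float, f_scale: float = 1.0) -> float:
    """Relative dynamic power after scaling voltage and frequency.

    First-order model: P_dyn is proportional to f * V^2.
    Assumption: switched capacitance and activity factor are unchanged.
    """
    if v_scale <= 0 or f_scale <= 0:
        raise ValueError("scale factors must be positive")
    return f_scale * v_scale ** 2


def dynamic_power_saving(v_reduction: float, f_scale: float = 1.0) -> float:
    """Fractional dynamic power saving for a fractional voltage reduction.

    v_reduction=0.05 means the supply voltage is lowered by 5%.
    """
    return 1.0 - dynamic_power_ratio(1.0 - v_reduction, f_scale)


if __name__ == "__main__":
    # Illustrative voltage reductions at constant frequency (not measured data).
    for v_reduction in (0.02, 0.05, 0.07):
        saving = dynamic_power_saving(v_reduction)
        print(f"{v_reduction:.0%} lower voltage -> {saving:.1%} lower dynamic power")
```

Under this simplified model, lowering the supply voltage by 5% at constant frequency reduces dynamic power by about 9.75%, since 1 - 0.95² = 0.0975. This is why even a small reduction in guard band can translate into a noticeable power saving.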