Proactive Thermal-Aware Scheduling
Type of Degree: Dissertation
Modern CPUs cut off operation when the CPU temperature reaches a predetermined threshold, making the CPU unavailable to all processes. Furthermore, operating the CPU for extended periods at temperatures close to, but slightly below, the hardware cut-off lowers the reliability and lifetime of the CPU. In this dissertation, we develop proactive scheduling techniques that manage CPU temperature by cutting off the major heat-dissipating processes rather than the entire CPU. Such proactive scheduling promotes longer component life, lower cooling fan usage, improved battery life and better availability. The techniques can be implemented on top of existing dynamic voltage and frequency scaling, dynamic power management, leakage energy and location-based techniques. Memory accesses and floating-point operations are two major heat-dissipating activities in many programs.
The first proactive approach developed is called Proactive Thermal Aware Scheduler (PTAS). PTAS forms a temperature predictor using a regression of the time derivative of the number of Floating-Point Operations per Second (FLOPS) and the current CPU temperature. The predictor is used to make proactive scheduling decisions that handle thermal emergencies before the temperature reaches the hardware cut-off. If the value of the predictor for any process is above an empirically determined cut-off, it is deemed likely that the CPU will reach the hardware cut-off temperature in the near future. Therefore, that process is moved to the sleep state for a short duration. We analyzed how well PTAS lowers CPU temperature using the Scimark benchmarks. The reductions in peak temperature were 2-4°C for the small versions of the FFT, LU, SOR and Sparse components of the Scimark benchmark. For the larger versions of the same components, the reductions were also 2-4°C. The reductions in peak/average temperature on a laptop were 3-5/5°C. The corresponding penalties in schedule length were between 15% and 30%.
The second approach is called Proactive Thermal Aware Scheduling with Floating-Points and Memory access rates (PTFM). In this approach, a future temperature impact predictor (TIP) for any process is formed using a regression of the time derivatives of FLOPS and memory accesses and the current CPU temperature. If the TIP for any process goes above a predetermined threshold, that process is put to sleep for a short duration. We evaluated the scheduler on the small and large versions of the FFT, LU, SOR and Sparse components of the Scimark benchmark suite. We observed decreases in peak/average CPU temperature of 3-6°C/6°C for the small benchmarks and 3-6°C/5°C for the large benchmarks. The schedule length penalties stayed within 2-10%. The corresponding reductions in peak/average temperature on a laptop were 3-6/6°C. We compared our results against other threshold-based cut-off approaches: simple temperature cut-off, simple time-derivative cut-off, and PTAS. We found that PTFM outperformed these strategies.
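The predictor-and-threshold decision described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the linear form of the predictor, the finite-difference derivatives, and the names `Sample`, `Coeffs`, `temperature_impact` and `should_sleep` are assumptions made for illustration, and the coefficient and threshold values are hypothetical placeholders rather than fitted or reported values.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One per-process measurement (hypothetical layout)."""
    t: float         # timestamp in seconds
    flops: float     # floating-point operations per second
    mem_rate: float  # memory accesses per second
    temp_c: float    # CPU temperature in degrees Celsius


@dataclass
class Coeffs:
    """Regression coefficients; assumed to be fitted offline."""
    intercept: float
    d_flops: float   # weight on d(FLOPS)/dt
    d_mem: float     # weight on d(memory access rate)/dt
    temp: float      # weight on the current temperature


def temperature_impact(prev: Sample, curr: Sample, c: Coeffs) -> float:
    """Assumed linear predictor built from two consecutive samples."""
    dt = curr.t - prev.t
    if dt <= 0:
        raise ValueError("samples must be strictly increasing in time")
    # Finite-difference estimates of the time derivatives.
    d_flops = (curr.flops - prev.flops) / dt
    d_mem = (curr.mem_rate - prev.mem_rate) / dt
    return (c.intercept
            + c.d_flops * d_flops
            + c.d_mem * d_mem
            + c.temp * curr.temp_c)


def should_sleep(prev: Sample, curr: Sample, c: Coeffs, threshold: float) -> bool:
    """True when the predictor exceeds the cut-off, i.e. the process
    would be put to sleep for a short duration."""
    return temperature_impact(prev, curr, c) > threshold


# Hypothetical numbers, chosen only to exercise the code.
coeffs = Coeffs(intercept=0.0, d_flops=1e-9, d_mem=1e-8, temp=1.0)
before = Sample(t=0.0, flops=1e9, mem_rate=1e7, temp_c=70.0)
ramping = Sample(t=1.0, flops=3e9, mem_rate=5e7, temp_c=72.0)
steady = Sample(t=1.0, flops=1e9, mem_rate=1e7, temp_c=72.0)

print(should_sleep(before, ramping, coeffs, threshold=74.0))
print(should_sleep(before, steady, coeffs, threshold=74.0))
```

In this sketch, setting the `d_mem` coefficient to zero leaves a predictor that depends only on the FLOPS derivative and the current temperature, which matches the inputs described for PTAS. Using both derivatives corresponds to the inputs described for the TIP in PTFM.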