Thermal Modeling and Management of Storage Systems

by

Xunfei Jiang

A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Auburn, Alabama
August 2, 2014

Keywords: Thermal, Modeling, Management, Storage Systems

Copyright 2014 by Xunfei Jiang

Approved by

Xiao Qin, Associate Professor of Computer Science and Software Engineering
Cheryl Seals, Associate Professor of Computer Science and Software Engineering
David Umphress, Associate Professor of Computer Science and Software Engineering
Saad Biaz, Associate Professor of Computer Science and Software Engineering

Abstract

Energy consumption of data storage systems has increased significantly over the past decades. There is an urgent need to build energy-efficient data storage systems. The computing cost of IT facilities and the cooling cost of air conditioners contribute a large portion of the total energy consumption of data centers. A large number of researchers focus on reducing the computing cost by balancing workload or powering off idle data nodes to save energy. In recent years, growing attention has been paid to decreasing the cooling cost. Temperature is a major contributor to cooling cost, and thermal management has become a popular topic in building energy-efficient data centers. Extensive research on the thermal impacts of processors and memories has been presented in the literature; however, the thermal impacts of disks have not been fully investigated.

In this dissertation, experiments are conducted to characterize the thermal behavior of processors and disks by using real-world benchmarks (e.g., Postmark and Whetstone). The profiling results show that disks have thermal impacts comparable to those of processors on the overall temperature of a data node. We then develop an approach to generate thermal models for estimating the temperatures of processors, disks, and data nodes. We validate the thermal models by comparing the predictions with real measurements from temperature sensors deployed on data nodes. We further propose an energy model to estimate the total energy cost of data nodes. Finally, by applying our thermal and energy models, we propose thermal management strategies for building energy-efficient data centers. These strategies include a thermal-aware task scheduling strategy, thermal-aware data placement strategies for homogeneous and hybrid storage clusters, and a predictive thermal-aware data transmission strategy.

Acknowledgments

This dissertation would not have been completed without the invaluable guidance, experience sharing, constant support and encouragement from my advisor, the people in our research group, and my family members during my study at Auburn University.

First and foremost, I would like to give my most sincere and deepest gratitude to my advisor, Dr. Xiao Qin, for his great efforts, trust and patience in my work. I will never forget his extensive knowledge in the field of storage systems and his inexhaustible enthusiasm for research, which keeps inspiring and driving me to accomplish my research. When working on the book chapter "Thermal Modeling and Management of Storage Systems in Data Centers", his insightful advice and suggestions helped and enlightened me in setting up accurate motivations behind the research and in building a thermal model and two concrete thermal-aware strategies that can reduce cooling and energy costs in data centers.

I am also tremendously grateful to be advised by my committee members, Dr. Cheryl Seals, Dr.
David Umphress and Dr. Saad Biaz, who reviewed my proposal and dissertation documents. They gave me a number of valuable suggestions, by which my dissertation has been substantially improved. I would also like to show my appreciation to Dr. Brian Thurow, my university reader.

Working with our research group is fantastic. I owe my gratitude to Xiaojun Ruan, Zhiyang Ding, Shu Yin, James Majors, Yun Tian, Jiong Xie, Yixian Yang, Maen Al Assaf, Ji Zhang, Ajit Chavan, Tausif Muzaffar, Sanjay Kulkarni and Yuanqi Chen, who helped me with paper writing, experimental result collection and group discussions. In addition, all the professors and students in the Department of Computer Science and Software Engineering are greatly appreciated for creating and maintaining an excellent atmosphere for study and research.

Finally and most importantly, the endless love from my family is the most powerful strength that keeps me fighting for my research. My mother Taochun Xie, my father Jizhong Jiang and my husband Ji Zhang always stay with me, cheering for every achievement and helping me overcome all difficulties.

To my parents and Ji Zhang

Table of Contents

Abstract
Acknowledgments
List of Figures
List of Tables
1 Introduction
  1.1 Motivations
  1.2 Contributions
  1.3 Organization
2 Related Work
  2.1 Energy-efficient Data Centers
    2.1.1 Computing Cost
    2.1.2 Cooling Cost
  2.2 Thermal Modeling
    2.2.1 CPU Models
    2.2.2 Disk Models
    2.2.3 Memory Models
  2.3 Solid State Disk (SSD)
  2.4 Thermal Management
    2.4.1 CPU Thermal Management
    2.4.2 Memory Thermal Management
    2.4.3 Storage Thermal Management
    2.4.4 NoC Thermal Management
    2.4.5 Predictive Thermal Management
  2.5 Data Compression
  2.6 Summary
3 Preliminary Thermal Models
  3.1 Thermal Impacts of Disk I/O
    3.1.1 Testbed
    3.1.2 Impact of CPU and Disks on Inlet/Outlet Temperatures
  3.2 Thermal Models
    3.2.1 Framework
    3.2.2 An Inlet/Outlet Temperature Model
    3.2.3 The COP Model
  3.3 Case Studies
  3.4 Summary
4 Advanced Thermal Models
  4.1 Testbed
  4.2 Thermal Models of Disks
    4.2.1 Ambient Impacts on Disk Temperatures
    4.2.2 Various Number of Transactions
    4.2.3 Disk Temperatures under Various Utilizations
    4.2.4 Different Number of Disks
  4.3 Thermal Model of CPU
  4.4 Thermal Model of a Data Node
  4.5 Evaluation of Temperature Models
  4.6 Summary
5 Thermal-aware Task Scheduling
  5.1 Framework
  5.2 Experiments
    5.2.1 CPU-intensive Workload
    5.2.2 I/O-intensive Workload
  5.3 Summary
6 Thermal-aware Data Placement
  6.1 Homogeneous Disk Arrays
    6.1.1 The Two-Disk Case
    6.1.2 The Three-Disk Case
    6.1.3 Data Placement Strategy
  6.2 Hybrid Storage Clusters
    6.2.1 System Configuration of Hybrid Storage
    6.2.2 Case Studies
  6.3 Summary
7 Predictive Thermal-aware Energy-efficient Data Transmission
  7.1 Introduction
  7.2 Preliminary Results
  7.3 Framework of Predictive Thermal-aware Management System
    7.3.1 Performance Model
    7.3.2 Thermal Model
    7.3.3 Computing Energy Power Model
  7.4 Results
  7.5 Summary
8 Conclusion
  8.1 Main Contributions
    8.1.1 Thermal Modeling of Disk Temperatures
    8.1.2 Thermal Modeling of CPU Temperatures
    8.1.3 Thermal Modeling and Energy Consumption of Data Nodes
    8.1.4 Thermal-aware Task Scheduling System
    8.1.5 Data Placement in Homogeneous Disk Arrays
    8.1.6 Data Placement in Hybrid Cluster Storage Systems
    8.1.7 Predictive Thermal-aware Management System (PTMS)
  8.2 Future Work
    8.2.1 Considering Ambient Temperatures
    8.2.2 Data Storage Nodes Equipped with Multiple Disks
    8.2.3 Heterogeneous Data Centers
    8.2.4 Thermal Models for Hadoop Clusters
    8.2.5 Energy-aware Hadoop Distributed File System
    8.2.6 Address Big Data Challenges
    8.2.7 Security Issue of Data Storage Systems
Bibliography

List of Figures

3.1 Temperature evaluation under the low CPU and low disk utilizations.
3.2 Temperature evaluation under the high CPU and low disk utilizations.
3.3 Temperature evaluation under the low CPU and high disk utilizations.
3.4 Temperature evaluation under the high CPU and high disk utilizations.
3.5 Framework of proposed solution.
3.6 Coefficient of the performance curve for the chilled-water CRAC units at the HP Labs Utility Data Center [83].
3.7 Three typical access patterns.
4.1 Disk temperatures are affected by ambient temperatures.
4.2 Disk temperature of running different tasks.
4.3 Thermal characteristics of running 5000 transactions.
4.4 Time comparison of disks running 5000 transactions.
4.5 Temperature comparison of disks running 5000 transactions.
4.6 Comparison of estimated disk temperatures with real measurements.
4.7 Western Digital HDD's utilizations with various write block sizes.
4.8 Western Digital HDD's temperatures with various write block sizes.
4.9 Western Digital HDD's temperature model validation (write block size: 128 Byte).
4.10 Disk utilizations under different write block sizes.
4.11 Peak disk temperatures under different write block sizes.
4.12 Inlet/outlet temperature differences in the cases of different numbers of disks.
4.13 CPU utilization under different scenarios.
4.14 CPU temperature under different scenarios.
4.15 CPU temperature model validation (12000 LOOPS).
4.16 Validation of the outlet temperature model.
4.17 CPU and disk utilizations for running WordCount.
4.18 CPU temperature model validation for WordCount.
4.19 Disk temperature model validation for WordCount.
5.1 The framework of thermal management system for task scheduling.
5.2 Execution time and active time of data nodes under CPU-intensive workloads.
5.3 Energy consumption of data nodes under CPU-intensive workloads.
5.4 Execution time and active time of data nodes under I/O-intensive workloads.
5.5 Energy consumption of data nodes under I/O-intensive workloads.
6.1 Thermal impacts of data placement in the two-disk case.
6.2 Thermal impacts of data placement in the three-disk case.
6.3 Thermal impacts of data placement in the three-disk case (2).
6.4 Thermal impacts of data placement in the three-disk case (3).
6.5 Thermal impacts of data placement in the three-disk case (4).
6.6 Two types of hybrid cluster storage systems.
7.1 Performance of transferring 1 text file in direct transmission.
7.2 Performance of transferring 1 text file in archived transmission.
7.3 Performance of transferring 1 text file in compressed transmission.
7.4 Performance of transferring Linux kernel files in direct transmission.
7.5 Performance of transferring Linux kernel files in archived transmission.
7.6 Performance of transferring Linux kernel files in compressed transmission.
7.7 The framework of the predictive thermal-aware management system.
7.8 Framework of the energy predictor module.
7.9 Energy cost of data nodes in transferring the ASCII files.
7.10 Energy cost of transferring the ASCII files under different transmission types.
7.11 Energy cost of data nodes in transferring the Human Genome dataset.
7.12 Energy cost of transferring the Human Genome dataset under different transmission types.

List of Tables

3.1 Testbed Configuration
3.2 Experiment Configuration
3.3 Thermal Model Validation
4.1 Testbed Configuration for Advanced Thermal Models
4.2 Specifications of the Two Disks
4.3 Configurations of Tasks with Various Number of Transactions
4.4 Execution Time of Running Three Different Tasks
4.5 Parameters for Fitting Polynomial and Logarithmic Models to Disk Temperature as a Function of Time
4.6 Postmark Configurations of Experiments on Disks
4.7 Western Digital HDD's Utilizations under Various Write Block Sizes
4.8 Parameters for Fitting Polynomial and Logarithmic Models to Western Digital HDD's Temperature under Various Utilizations
4.9 CPU Utilizations and Temperatures in the Steady Stage under Various Number of Loops
4.10 Parameters for the Linear Model to Estimate Outlet Temperatures as a Function of CPU and Disk Temperatures
4.11 Estimated Disk Temperatures under a Specific Utilization
5.1 Task Configurations of CPU-intensive Workloads
5.2 Task Scheduling Schemes for CPU-intensive Workloads
5.3 Task Configurations of I/O-intensive Workloads
5.4 Task Scheduling Schemes for I/O-intensive Workloads
6.1 The Two-Disk Scenarios
6.2 Peak Average Disk Temperatures and Total Task/Application Execution Times
6.3 Testbed Configuration for Three-Disk Case
6.4 The Three-Disk Scenarios
6.5 Peak Average Disk Temperatures, Execution Times and Estimated Cooling Costs
7.1 Testbed Configuration for Data Transmission
7.2 Summary of Single Text File Transmission
7.3 Empirical Results of Transferring Linux Source Code Files

Chapter 1
Introduction

Thermal modeling and management techniques have been widely studied in recent years. Research shows that thermal management can increase the energy efficiency of data centers. Previous research studied the thermal impacts of processors on data storage nodes; however, the thermal impacts of disks have not been fully investigated. In our study, we consider the thermal impacts of both processors and disks, and propose a thermal model to estimate the outlet temperature of a data storage node based on the activities of its processors and disks. By applying our thermal model to an energy consumption model, we estimate the total energy cost of data nodes. Furthermore, we also evaluate the impact of different thermal management strategies on energy consumption.

Energy consumption of data centers has increased very quickly in recent years [111][57], and the portion taken by computing cost and cooling cost is growing significantly. Statistics show that computing cost and cooling cost of data centers take up to 25% of the total energy cost of data centers [20]. There have been studies to analyse the performance and energy consumption of data centers. Chen et al. studied the task-based energy consumption of cloud storage systems [34] and proposed StressCloud to analyse the performance and energy consumption of these systems [33]. Zhang and Fu presented power profiling results on a cloud testbed by combining hardware and software to achieve power and energy profiling at server granularity [120]. In order to reduce the energy cost of data centers, much effort has been made to reduce the computing cost and cooling cost of storage systems.

1.1 Motivations

Our proposed thermal model is indispensable for next-generation storage clusters because of the following five factors:
1. the ever-increasing cooling and energy costs of large-scale storage clusters,

2. the impact of hybrid storage on thermal management of data centers,

3. the importance of reducing thermal monitoring cost,

4. the capability of estimating the cooling cost of a data center, and

5. the lack of study on the impacts of hard drives and solid state disks on outlet temperatures of storage nodes in a cluster.

With the increase of energy consumption and cooling costs of large-scale storage clusters, there is an urgent need for data center designers to address energy efficiency issues [10]. Conventional energy-saving approaches for data centers include improving the energy efficiency of computing facilities as well as cooling systems.

Cooling costs contribute a large portion of the total energy cost of data centers [49][10]. For instance, the power and cooling cost to support IT equipment takes more than half of the total energy cost of a data center [49]. Previous studies demonstrate that energy efficiency can be enhanced by reducing the energy dissipation in cooling systems [83][113]. Reducing outlet temperatures or optimizing air recirculation can improve energy efficiency [108]. Moreover, load balancing strategies were proposed to achieve good temperature distribution. Recent studies show that reducing the outlet temperatures of servers in a data center could save up to 40% of energy consumption [83]. Lowering the outlet temperatures of storage nodes not only conserves cooling cost, but also improves the reliability and lifetime of disks [90][116].

A handful of studies have focused on modeling the energy consumption of storage clusters in the past years. For example, an energy model was introduced to estimate the power consumption of storage nodes running under specific workloads [16]. Unfortunately, thermal models of storage clusters are still in their infancy. Little attention has been paid to the thermal impact of disks, including HDDs and SSDs, on the energy efficiency of cooling systems in data centers.

Deploying temperature sensors on the storage nodes of a cluster is a common method to monitor the storage cluster's temperature. For each data node, one needs to apply at least two sensors to obtain the inlet and outlet temperatures. If the temperatures of other interior devices of a node need to be monitored, additional sensors must be set up. Although this traditional approach is practical for measuring the temperatures of small-scale storage clusters, it becomes unwieldy when a storage cluster has thousands of nodes. It is extremely expensive to set up a huge number of sensors in a large-scale storage cluster, and deploying sensors also leads to extra energy cost. Thermal models are a promising alternative for obtaining the temperatures of storage clusters.

Building a data center is a huge investment for enterprises. Estimating the energy costs, which include cooling cost and power cost, offers an important guideline in the design phase. Simulations and thermal models help data center designers make decisions on thermal management during the planning phase.

A variety of factors impact the outlet temperatures of storage nodes. A study shows that inlet temperatures and CPU utilization affect the outlet temperatures of data nodes [108]. In a second study, a temperature model was proposed using historical temperature data and airflow of a data center [71]. When it comes to the thermal behavior of disks, Kim et al. investigated the relationship among disk seek time, inter-seek time, and disk temperatures [65].
They also observed that the number and size of platters in a disk affect its temperatures. In enterprise-level Tera-data centers, a single node is capable of supporting more than 100 disks [88]. The temperatures of such a large number of disks within a data node play a crucial role in determining the outlet temperatures of the node. However, there is a lack of studies on the impact of hard drives and solid state disks on the outlet temperatures of storage nodes in a cluster.

In addition, with the growth of data transmission through networks, the energy cost of these transmission activities becomes another important issue in maintaining energy-efficient data centers. For example, for the famous social network website Facebook, the number of worldwide monthly active users increased from 100 million in the third quarter of 2008 to 1.28 billion in the first quarter of 2014 [106]. Every day, 72 million links are shared, 300 million photos are uploaded, 2.5 billion status updates are posted, and 2.7 billion likes and comments are made [11]. For such a huge amount of data transmission through the Internet, the energy consumption of these transfer activities is considerably high. If we could reduce the energy consumption of these data transmissions, we would gain a large reduction in both the computing cost and the cooling cost of data centers.

1.2 Contributions

We introduce a modeling approach to build thermal models for estimating the outlet temperature of a storage node and propose thermal-aware management strategies for data storage systems. We make the following three contributions.

• First, we generate the thermal profile of a storage server. The profiling results are obtained by running CPU-intensive and I/O-intensive workloads imposed by Whetstone [9] and Postmark [63], respectively. While the CPU/disk is running under various workload scenarios, we monitor the CPU/disk temperature as well as the inlet and outlet temperatures of the data node. We study not only the thermal behavior of a hard disk drive but also that of a solid state disk.

• Second, we build a thermal model to estimate outlet temperatures of data nodes using inlet temperatures and CPU and disk workloads. This model can predict outlet temperatures from CPU and disk utilizations.

• Third, we propose thermal management strategies to build energy-efficient data storage
Nu- merous methods are proposed to save energy cost for data centers. This chapter briefly presents previous research in building energy-efficient data centers. Research of comput- ing and cooling cost reduction is introduced, and thermal models that play critical roles in predicting cooling cost are also studied. Features and previous studies of solid state disks are investigated. A lot of thermal management strategies have been raised to save energy by considering the thermal impacts on cooling cost of data centers. In addition, data com- pression is another method to save energy by reducing the data set size and improving the performance of data transmission. 2.1 Energy-efficient Data Centers Tens of thousands of data centers around the world are consuming huge amount of en- ergy. Increasing business companies, IT companies, and institutes are planning to build their own data centers. A study by DatacenterDynamics demonstrates that worldwide investment in data centers in 2012 had increased by 22.1% up to 105 billion dollars compared with 2011, and this investment is going to grow by another 14.5% to 120 billion dollars for 2013 [62]. Researchshowstherapidincrementofenergyconsumptionofdatacenters[51][57][111]. A report announces that 1,500 TWh of electricity, which is nearly 10% of world electricity generation, is used by the world?s Information-Communication-Technologies (ICT) ecosys- tem annually [81]. Furthermore, global data centers are estimated to consume (as of 2010) from 250 to 350 TWh every year. A reason behind the striking energy consumption in data centers is the rapid growth of computing and storage capacity in recent years. For instance, 6 Facebook has invested more than 1 billion in IT facilities that power its social network, which now serve more than 845 million users in a month around the world [80]. Cloud computing has become a popular topic in recent years. A study shows that coal and nuclear, which generate severe air pollution, are used to satisfy these large amount of electrical energy demand [41]. Apple, HP, IBM, Facebook, and Mircosoft are using dirty energy to power their growing cloud data centers. Confronting with the rapid increment of energyconsumptionandsevereairpollution,growingattentionhasbeenpaidtobuildenergy- efficientdatacenters[12][43][53][69]. Atthesametime, smallormediumsizedorganizations began to move their computing applications to an Internet-based "cloud" platform in order to improve energy efficiency [114]. Computing cost and cooling cost are major components of total energy consumption for data centers. Computing cost refers to the electronic energy cost that makes the IT facilities working. And cooling cost is the cost of cooling systems that lower down the temperature in data centers. Studies have been conducted on reducing either the computing cost or cooling cost in order to build energy-efficient data centers. 2.1.1 Computing Cost Alotofresearchhavebeendoneinreducingcomputingcostofdatacenters[14][100][119]. For instance, CMPs are widely used in data centers, and the frequency/voltage of CPU cores could be adjusted in order to save power consumption. Mishra et al. proposed a two-tier feedback-based control scheme, in which the first-tier is comprised of a global power manager toallocatepowertargetstoindividualislandsaccordingtoworkloadsandthesecond-tiercon- sists of local controllers that adjust island power through changing the voltage and frequency as a response of workload requirements [82]. 
A power-efficient scheme for erasure-coded storage clusters, ECS2, was proposed, which aims to offer high energy efficiency with marginal reliability degradation [56].

Popular strategies to reduce computing cost include redistributing workload and powering off idle disks or data nodes. For example, an energy-efficient strategy was proposed which designates a subset of disks as cache disks and dispatches workloads to these cache disks while letting the other disks spin down [38]. Another strategy introduced a Popular Data Concentration (PDC) technique that migrates frequently accessed data to a subset of disks [89]. The other disks, which are not accessed frequently, can then be transitioned to a low-power mode, and the total computing cost of these data nodes can be reduced.

Many researchers concentrate on resource management and task scheduling in data centers to decrease computing energy consumption [13][23][24][70][112]. For instance, Beloglazov and Buyya proposed an energy-efficient resource management system for virtualized Cloud data centers [24]. In this system, VMs are consolidated according to resource utilization, the virtual network topologies built between VMs, and the thermal status of computing nodes in order to save energy. This management system reduces the operational costs of data centers and provides the required Quality of Service (QoS). Beloglazov et al. also demonstrated an architectural framework (including resource provisioning and allocation algorithms) and principles for energy-efficient Cloud computing [23]. Experimental results show that their Cloud computing model has immense potential for energy saving and energy efficiency improvement under dynamic workload scenarios. In addition, Aksanli et al. demonstrated an adaptive job scheduler that utilizes the prediction of solar and wind energy production [13]. This job scheduler improves energy efficiency by three times. Lee and Zomaya pointed out that under-utilized resources account for a large amount of energy use and that resource allocation strategies can be applied to achieve high energy efficiency [70]. They proposed two task consolidation heuristics that aim to maximize resource utilization while taking into account both active and idle energy consumption. Experimental results illustrated the energy saving capability of their heuristics.

With the growth of data center density and size, designers should take into account both energy costs and carbon footprint. Altering the usage patterns of data centers is believed to be a practical method to affect demand response. Chiu et al. pointed out that shifting computational workloads across geographic regions to match electricity supply may help balance the electric grid [37]. They proposed a symbiotic relationship between data centers and grid operators and a low-cost workload migration mechanism. Ren and He proposed an online algorithm, called COCA (optimizing for COst minimization and CArbon neutrality), for minimizing operational cost in data centers while satisfying carbon neutrality without long-term future information [92]. COCA enables distributed server-level resource management: each server autonomously adjusts its processing speed and optimally decides the amount of workload to process. Analysis of trace-based simulation studies shows that COCA reduces cost by more than 25% (compared to the state of the art) while resulting in a smaller carbon footprint.
Furthermore, network facilities have also been investigated in order to reduce the energy consumption of data centers. The architecture of a Data Center Network (DCN) affects its scalability, while its power consumption is a main contributor to its energy cost. Hammadi and Mhamdi classified existing DCNs into switch-centric and server-centric networks, and conducted a literature review of existing technologies in energy saving and renewable energy approaches [55].

2.1.2 Cooling Cost

Cooling cost is a non-negligible component of the total energy consumption of a data center. A growing number of studies are investigating strategies to save cooling cost [22][76][94]. Generally speaking, there are seven categories of strategies for saving the cooling cost of data centers [39]. Major strategies include managing airflow in data centers, locating cooling systems as close as possible to IT equipment, using dynamic control to match the thermal load of data centers, and maintaining a higher operating temperature.

A novel approach was proposed to model the energy flows in a data center and optimize its operations [76]. The overall sustainability of data center operations can be improved through a holistic approach. In this approach, predictions of renewable energy and IT demands were conducted and an IT workload management plan was generated. This management plan schedules IT workloads and allocates IT resources depending on cooling efficiency and power supply. Experimental results show that this approach saves both recurring power cost and the use of non-renewable energy. However, constraints exist in optimizing the energy consumption of data centers, such as the threshold for incoming temperatures, the capacity, and the response time. To balance performance and the temperature constraint, a coupled thermal-performance model and a cooling-aware workload placement strategy were proposed [94]. The thermal-performance model leads to a power saving of 21% and the data placement strategy gains an energy saving of 8%.

Research has demonstrated the efficiency of workload management strategies in reducing the outlet temperatures of data nodes [83], minimizing heat recirculation [109], or decreasing inlet temperatures, which leads to a reduction of the cooling cost of data centers [110]. For instance, a thermal-aware task scheduling algorithm, XInt, was proposed to minimize heat recirculation by balancing the workloads within a homogeneous data center [110]. In this work, researchers discovered that cooling costs highly depend on peak inlet temperatures. In order to lower cooling power, they designed a task assignment policy, MPIT-TA, which minimizes the peak inlet temperature through task assignment. Their simulation results show that MPIT-TA saves at least 20% of cooling energy.

After analysing the Energy Inefficiency Ratio of SPatial job scheduling (a.k.a. job placement) algorithms, also referred to as SP-EIR, a coordinated cooling-aware task placement and cooling management algorithm, Highest Thermostat Setting (HTS), was developed [22]. HTS is aware of the dynamic behavior of the Computer Room Air Conditioner (CRAC) units and dispatches tasks to reduce the cooling demands from the CRACs. Dynamic updates of the CRAC thermostat settings based on the cooling demands can decrease the total energy consumption.

2.2 Thermal Modeling

Temperature is a major contributor to cooling cost, and studies have been conducted to reduce heat generation or speed up heat dissipation. Heat sinks and heat pipes have been investigated to promote heat dissipation.
For instance, a heat sink model associated with one of the IEEE EMC challenge problems was used to study three different grounding configurations [77]. A new simulation model for an Intel P4 CPU heat sink was proposed and analyzed. With such a model, an optimal design of the CPU heat sink can be performed in order to minimize the radiated emission from the heat sink. Besides heat sinks, heat pipes are also applied to transfer heat from hot to cold regions. Researchers presented a time- and temperature-aware methodology that uses additional heat pipes [50]. A thermal model was developed to simulate the effects of metal interconnects on heat distribution. Results show that, by deploying additional heat pipes, their methodology gains a 5% to 7% decrease in overall temperature variation and a 2 to 3 degree reduction in hotspot temperature.

Besides deploying heat sinks and heat pipes, another solution is to reduce the heat generation of data nodes. To characterize the thermal behaviour of data nodes, studying the thermal impact of IT components on each data node is an important approach. There has been extensive research investigating the thermal behaviors of CPUs, disks, memories, and network cards, and researchers find that these components make key contributions to the outlet temperature of a data node.

2.2.1 CPU Models

The CPU has been identified as a resource that greatly contributes to energy consumption in data centers. Studies have analyzed and modeled the power consumption of processors [26][28][45]. The thermal impacts of processors are also widely studied. For example, HotSpot was proposed to estimate the temperatures of CPUs; it can accurately and quickly predict CPU temperatures at the micro-architecture level [103]. This model is based on an equivalent circuit of thermal resistances and capacitances that correspond to micro-architecture blocks and essential aspects of the thermal package. In order to model the thermal behaviour of different types of CPU, the micro-architecture needs to be studied and sophisticated models have to be generated. Little work has been done to model CPU temperatures at a coarse-grained level.

2.2.2 Disk Models

There have been studies investigating the thermal characteristics of disks. An early work proposed a thermal model to predict the transient temperature of an IBM fixed disk drive [46]. Another work introduced a three-dimensional transient temperature model, which estimates disk temperatures under frequent seeking operations [107]. A comprehensive model that takes into account five components of a hard drive disk (internal drive air, spindle motor, the base and cover of the disk, the voice-coil motor, and disk arms) was demonstrated to predict disk temperature [54]. Researchers also studied the impacts of seek time and inter-seek time on disk temperature [65], and they found that either increasing the inter-seek time or decreasing the seek time can decrease the disk temperature.

Previous disk temperature models took heat dissipation and disk activities into account from a fine-grained perspective. Detailed specifications are needed to model the temperature of a new disk, and the disk temperature under a particular workload cannot be estimated by simply reusing previous modeling approaches. To address this problem, we conducted studies on disk temperatures by considering the thermal impacts of disk utilizations on disk temperatures, and proposed thermal models for both hard drive disks and solid state disks [58][59].
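As a concrete illustration of this coarse-grained approach, the sketch below fits a logarithmic warm-up curve to disk temperature samples collected over time, similar in spirit to the polynomial and logarithmic fits reported later (e.g., Table 4.5). The functional form, the sample data, and the fitted coefficients are illustrative assumptions rather than the measurements or parameters used in this dissertation.

```python
import numpy as np
from scipy.optimize import curve_fit

def disk_temp(t, a, b):
    # Logarithmic warm-up curve: the temperature rises quickly at first,
    # then levels off as the drive approaches a steady state.
    return a * np.log(t) + b

# Hypothetical samples: elapsed time (seconds) and disk temperature (Celsius),
# as a tool such as hddtemp might report them during a Postmark run.
time_s = np.array([60, 120, 300, 600, 1200, 2400, 3600], dtype=float)
temp_c = np.array([34.0, 34.8, 36.1, 36.9, 37.8, 38.5, 38.9])

(a, b), _ = curve_fit(disk_temp, time_s, temp_c)
print(f"fitted model: T(t) = {a:.3f} * ln(t) + {b:.3f}")
print(f"predicted temperature after 1800 s: {disk_temp(1800.0, a, b):.1f} C")
```

Once such a curve has been fitted per utilization level, the disk temperature under a given workload can be estimated without knowing the drive's internal geometry, which is exactly the specification gap the coarse-grained models aim to close.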
2.2.3 Memory Models

Approaches were also proposed to coordinate processors and memory to improve system performance and/or power efficiency during memory thermal emergencies [73][74]. Adaptive core gating (DTM-ACG) and coordinated DVFS (DTM-CDVFS) schemes as well as a thermal model were designed to predict DRAM temperatures [74]. Experimental results demonstrate that these two schemes achieve 6.7% and 15.3% performance improvements, respectively. DTM-CDVFS also reduces the processor power rate by 15.5% and the system (including processor and memory) energy by 22.7%. Besides that, a DRAM thermal model was proposed and validated with measurements on an instrumented server platform. Experimental results illustrate that their model reflects the dynamic DRAM temperature changes; the average temperature difference between estimated and measured values is less than 1 ℃.

2.3 Solid State Disk (SSD)

The solid state disk (SSD) is an emerging storage technology with high I/O performance and energy efficiency. There is a high potential to widely apply SSDs in large-scale cluster storage systems. SSDs are more expensive than traditional hard drives [85], but they perform better than traditional hard drive disks in random reads and writes [44][75][78]. Meanwhile, with the increasing density of flash-based SSDs, reliability, endurance, and performance are all declining [52]. A growing number of studies have been conducted to improve the reliability and performance of SSDs [31][61][102].

To improve both performance and energy efficiency, hybrid SSD devices may be employed to build large storage systems. Recently, Chang proposed an SSD-based hybrid storage system that combines MLC flash-based and SLC flash-based SSDs [30]. Their experimental results demonstrate that, compared with MLC-flash-based SSD storage, the hybrid system can gain significant improvements in terms of throughput and energy savings. Oh et al. proposed a cost-effective and reliable SSD host cache solution, SRC (SSD RAID Cache) [86]. In this solution, cost-effectiveness is ensured by using multiple low-cost SSDs, and reliability is enhanced by RAID-based data redundancy.

Apart from hybrid SSDs, hybrid storage systems that combine HDDs and SSDs have also been proposed to make a good trade-off between performance and cost. For example, Chen et al. designed a hybrid storage system, Hystor, in which hot data is stored on SSDs to optimize system I/O performance [35]. All data accesses are periodically recorded and analyzed by a monitor module. When any data becomes hot, it is moved to an SSD to reduce data access time. Wu et al. developed a hybrid page/block architecture along with an advanced replacement policy called BPAC to exploit both temporal and spatial locality [115]. Mao et al. proposed a hybrid parity-based disk array architecture (HPDA), where SSDs and HDDs are integrated in a RAID system to improve the performance and reliability of the RAID [79]. Balakrishnan et al. proposed Diff-RAID, a parity-based redundancy solution that unevenly distributes and balances the parity across SSDs to improve the reliability of storage systems [21]. Schall et al. investigated the performance and energy efficiency of SSDs and HDDs in I/O-intensive database applications [97]. Although hybrid storage systems can offer good performance and reliability, less attention has been paid to the thermal characteristics of hybrid storage devices, which have significant impacts on the energy costs of cooling systems in future data centers.
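The hot-data migration idea used by systems such as Hystor can be sketched in a few lines: accesses are counted per block, and a block is promoted from the HDD to the SSD tier once it crosses a popularity threshold. The class, threshold value, and counter below are purely illustrative assumptions; they are neither Hystor's actual policy nor the data placement strategy proposed later in this dissertation.

```python
from collections import Counter

class HybridPlacement:
    """Toy hot-data tracker for an HDD+SSD tier (illustrative only)."""

    def __init__(self, hot_threshold=100):
        self.access_counts = Counter()  # block id -> number of accesses
        self.on_ssd = set()             # blocks already promoted to the SSD tier
        self.hot_threshold = hot_threshold

    def record_access(self, block_id):
        self.access_counts[block_id] += 1
        # Promote a block once it has been accessed often enough to count as "hot".
        if (self.access_counts[block_id] >= self.hot_threshold
                and block_id not in self.on_ssd):
            self.promote(block_id)

    def promote(self, block_id):
        # A real system would copy the block from the HDD to the SSD here.
        self.on_ssd.add(block_id)

tracker = HybridPlacement(hot_threshold=3)
for block in [7, 7, 9, 7, 9, 9]:   # blocks 7 and 9 both become hot
    tracker.record_access(block)
print(sorted(tracker.on_ssd))       # -> [7, 9]
```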
2.4 Thermal Management

Improving energy efficiency is becoming increasingly important for data centers. Techniques and strategies that reduce the energy cost of cooling systems make a major contribution to advancing energy-efficient data centers. A growing body of research focuses on thermal management to build energy-efficient data centers [47][64][87][99][118]. Thermal-aware resource management strategies have been proposed for balancing the temperature distribution in data centers in order to save energy.

2.4.1 CPU Thermal Management

A handful of temperature-aware load balancing strategies that consider the thermal impacts of processors have been proposed [67][95]. For instance, a customized threshold can be set to limit CPU temperatures [96]. If the CPU temperatures exceed the threshold, the CPU's voltage and frequency are dynamically adjusted to conserve CPU energy consumption at the cost of increased execution time. Sharma et al. demonstrated a thermal-load-balancing framework to dynamically distribute workloads across data nodes in a data center [101]. Their simulation results show that equipment reliability can be improved by placing an asymmetric workload and uniformly distributing temperature in data centers. Ayoub et al. presented a multi-tier approach for significantly reducing the cooling costs associated with fan subsystems without compromising system performance [17]. Fan speed is managed by intelligently allocating the workload at the core level as well as at the CPU socket level. At the core level, a proactive dynamic thermal management scheme is proposed, and a new predictor is introduced to utilize the band-limited property of the temperature frequency spectrum.

2.4.2 Memory Thermal Management

The energy consumption and thermal behaviour of memory have also been investigated. A joint energy, thermal and cooling management technique (JETC) was proposed to reduce the cooling and memory energy cost of each server [18]. JETC takes into account the thermal and power states of the CPU and memory, the thermal coupling between CPU and memory, and the fan speed to make energy-efficient decisions. CPU and memory actuators are used to carry out these decisions. The memory actuator decreases the energy cost of memory by performing cooling-aware clustering of memory pages onto a subset of memory modules. The CPU actuator reduces cooling energy by reducing the hot spots between and within the CPU sockets and minimizing the effects of thermal coupling. Their experimental results show that employing JETC leads to a 50.7% average energy reduction in cooling and memory subsystems with less than 0.3% performance overhead.

A Coordinated Management of Energy, Thermal, and Cooling (CoMETC) technique was proposed to minimize the cooling and memory energy of server machines [19]. State-of-the-art solutions decouple the optimization of cooling costs and the energy consumption of CPU and memory subsystems. This leads to suboptimal solutions because of the thermal differences between CPU and memory and the non-linearity in the energy costs of cooling. CoMETC decreases the memory operational energy by clustering active memory pages onto a subset of memory modules while accounting for thermal and cooling aspects. Simultaneously, CoMETC removes hotspots between and within the CPU sockets and reduces the impacts of thermal coupling with memory.

2.4.3 Storage Thermal Management

The energy consumption and power management of storage systems are widely investigated [27][72][93][98].
In addition to studies of write policies [66], cache and prefetching techniques have been proposed to save disk energy consumption. For instance, Song et al. proposed a data prefetching scheme in which the amount of data prefetched for each video stream is dynamically adjusted according to the bit-rates of the streams and the power characteristics of different disks [104].

The storage system is known to be the biggest power consumer in data centers. Disk drives are often lowly utilized, so there is large room for saving the power consumption of disks. A methodology that quantitatively estimates the performance impact of power savings was proposed [117]. By taking into consideration the effects of propagation delay, the correctness and efficiency of the proposed analytical methodology were verified in experiments driven by production server traces.

A large fraction of the power budget in data centers is consumed by storage systems, yet enterprise storage systems are not widely deployed with power-saving solutions. The traditional approach of spinning down disks is ineffective because idle periods are too short for industry workloads. By analyzing block-level traces from 36 volumes in an enterprise data center for one week, Narayanan et al. concluded that significant idle periods exist and can be further increased by modifying the read/write patterns using write off-loading [84]. Write off-loading allows write requests targeting spun-down disks to be temporarily redirected to other persistent storage in the data center. Experimental results show that spinning down disks when they are in the idle state could save 28-36% of energy, while write off-loading further increases the savings to 45-60%.

Disk power consumption can be reduced by putting a disk drive into a low-power mode during idle times. The problem is that future job arrivals are unknown, so future disk activities cannot be predicted. By exploring the ranges and trade-offs of possible power savings and performance within a set of enterprise storage traces, Riska and Smirni demonstrated the difficulty of obtaining significant power savings in traces where the overall utilization is less than 5%, and explored the feasibility of popular schemes such as workload shaping for power savings [93]. They proposed a proactive autonomic algorithm that suggests when and for how long a power-saving mode should be activated, given an acceptable performance degradation target. Their experimental results show the robustness of the algorithm.

Bostoen et al. studied alternative methods that reduce disk access time, conserve space, or exploit energy-efficient storage hardware in dynamic power management [27]. Previous energy-conservation techniques do not consider the fundamental trade-offs between power, capacity, performance, and dependability. Their work aims to stimulate the integration of different power-reduction techniques into new energy-efficient file and storage systems.

However, previous load balancing strategies have not fully considered disks as an important thermal contributor to the outlet temperature. In this dissertation, we study the thermal impacts of disks and propose thermal-aware management strategies to save energy consumption, especially cooling cost.

2.4.4 NoC Thermal Management

Nowadays, three-dimensional network-on-chip (3D NoC), which integrates NoC and die-stacking 3D IC technology, achieves lower latency, higher network bandwidth, and lower power consumption.
However, as more dies are stacked vertically, the increases in the length of the heat conduction path and in the power density per unit area cannot be ignored. Chao et al. found that the routers of a NoC have a thermal impact comparable to that of processors [32]. Their research shows that the NoC contributes significantly to the overall chip temperature. They proposed a traffic- and thermal-aware run-time thermal management (RTM) scheme, which ensures both thermal safety and a smaller negative performance impact from temperature regulation. Based on their simulation experiments, the RTM scheme is effective and can be combined with thermal-aware mapping techniques to achieve higher run-time thermal safety.

Although three-dimensional Network-on-Chip (3D NoC) has been proposed to solve complex on-chip communication issues, the thermal problem becomes another big issue because of the larger power density and the heterogeneous thermal conductance in the different silicon layers of a 3D NoC [36]. When a device becomes thermal-emergent, Dynamic Thermal Management (DTM) techniques are triggered. However, these reactive DTM schemes result in significant system performance degradation. Thus, the authors proposed a temperature prediction model and a proactive DTM with vertical throttling (PDTM-VT) scheme, which is managed by the distributed Thermal Management Unit (TMU) on each NoC node. Based on the temperature prediction model, the TMU can manipulate devices to avoid thermal emergencies. According to their experimental results, the prediction error of the proposed temperature prediction model is less than 0.25% compared with real measurements within 50 ms. Furthermore, an 11.84% to 23.18% reduction of thermal-emergent nodes and a 0.47% to 47.90% improvement of network throughput can be observed when PDTM-VT is used.

2.4.5 Predictive Thermal Management

Besides traditional dynamic thermal management techniques, which take actions after an emergency occurs, predictive thermal management strategies have also been studied [48][91]. A performance-effective Dynamic Thermal Management (DTM) system for multimedia applications was demonstrated to reduce energy consumption [105]. In this study, a predictive DTM algorithm was developed to efficiently use response mechanisms. The experimental results show that the DTM algorithm performs significantly better than existing reactive DTM algorithms. Another group of researchers built a software structure for Internet services (C-Oracle) [91]. In this study, the system chooses the best reaction by predicting and evaluating the temperature and performance impacts of various thermal management reactions. C-Oracle effectively deals with thermal emergencies without unnecessary performance degradation. In addition, an energy-saving framework that provides energy estimation before data is transmitted was proposed [60]. Experimental results show that this framework chooses the most energy-efficient data transmission strategy from given candidate strategies by using related runtime information.

2.5 Data Compression

Data compression techniques have been widely applied to achieve high space efficiency in storage systems and to shorten data retrieval time [68][15]. Compression techniques are able to reduce data sizes; however, existing compression techniques introduce extra CPU overhead. In addition, the compression ratios of a particular method may vary greatly for different file types. Cannane and Williams proposed a semi-static phrase-based scheme called XRAY [29]. An offline model is first built using training samples selected from the data collection.
Then, the entire collection can be compressed online in a single pass. The experimental results illustrate that their method performs well for compressing large general-purpose collections, especially in cases where an individual record or document needs to be decompressed. Reetuparna et al. explored the performance and energy behaviours of data compression on a Network-on-Chip (NoC) [42]. Two configurations examined in their study are Cache Compression (CC) and Compression in the Network Interface Controller (NIC). Decompression latency can be hidden by overlapping it with NoC communication latency. The simulation results show that the compression-on-NoC method achieves energy savings of 20%.

2.6 Summary

One objective of this dissertation is to propose thermal-aware management strategies to save the energy cost of data centers. To reduce energy consumption, efforts have been placed on improving performance or decreasing temperatures in data centers. In the first section, we introduced the main methods for building energy-efficient data centers. In the second section, thermal models of components in data nodes were investigated. We observed that previous thermal models predict disk temperature at a fine-grained level; if detailed specifications and properties of a disk are not available, it is impossible to model the disk temperature. In addition, solid state disks have become increasingly popular in data storage. In the third section, we presented related work on solid state disks. Then, previous thermal management strategies were discussed in the fourth section. Finally, data compression methods were discussed in the fifth section.

Chapter 3
Preliminary Thermal Models

There have been a lot of studies on constructing thermal models for data centers. Some generate models to estimate the thermal behaviours of CPUs, disks, memories, and network cards. Others model the outlet temperature of data nodes by taking into account air recirculation in data centers. However, the thermal behaviour of disks and their impacts on data nodes have not been fully explored. In this chapter, we generate the thermal profile of a storage server containing three hard disks. The profiling results show that disks have thermal impacts on the overall storage node temperature comparable to those of processing and networking elements. Then, we develop a thermal model to estimate the outlet temperature of a storage server based on processor and disk utilizations.

The rest of this chapter is organized as follows. In Section 3.1, a group of experiments is presented for evaluating the thermal impact of both CPU and disks on outlet temperatures. In Section 3.2, we propose a thermal model for predicting the outlet temperature under four types of workloads: the combinations in which the CPU and disks are either idle or fully utilized. The thermal model is validated against data acquired by an infrared thermometer as well as built-in temperature sensors on disks. Then, case studies of applying the thermal model to analyse real problems are presented in Section 3.3. Finally, Section 3.4 concludes this chapter by summarizing its main contributions.

3.1 Thermal Impacts of Disk I/O

To characterize the impacts of CPU and disks on the inlet/outlet temperatures of a data node, we conduct a number of experiments on a Linux server. In these experiments, CPU temperatures are detected by the software lm-sensors [3] and disk temperatures are collected by the software hddtemp [1]. The inlet and outlet temperatures are acquired by an infrared thermometer.
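For reference, the snippet below shows one way such measurements might be scripted: it polls the `sensors` command (from lm-sensors) for per-core CPU temperatures and `hddtemp` for a disk temperature once per minute. The device path, output parsing, and sampling interval are assumptions made for illustration; hddtemp normally requires root privileges, and the inlet/outlet readings in our experiments still come from the infrared thermometer rather than from software.

```python
import re
import subprocess
import time

def cpu_core_temps():
    """Parse per-core temperatures from the output of `sensors` (lm-sensors)."""
    out = subprocess.run(["sensors"], capture_output=True, text=True).stdout
    # A typical line looks like: "Core 0:  +41.0 C  (high = +80.0 C, crit = +100.0 C)"
    return [float(m) for m in re.findall(r"Core \d+:\s+\+?([\d.]+)", out)]

def disk_temp(device="/dev/sda"):
    """Read one disk's temperature via hddtemp (usually requires root)."""
    out = subprocess.run(["hddtemp", device], capture_output=True, text=True).stdout
    match = re.search(r"([\d.]+)\s*.?C", out)
    return float(match.group(1)) if match else None

# Log one sample per minute while a workload (e.g., stress or Postmark) runs.
for _ in range(3):
    print(time.strftime("%H:%M:%S"), cpu_core_temps(), disk_temp())
    time.sleep(60)
```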
3.1.1 Testbed

The testbed used in these experiments is equipped with four Intel(R) Xeon 2.4 GHz CPU cores (X3430), 2.0 GBytes of RAM, and three 160 GBytes SATA disks deployed in a disk array. The configuration parameters are summarized in Table 3.1.

Table 3.1 Testbed Configuration
  Hardware                                              Software
  4 Intel(R) Xeon 2.4 GHz CPU X3430                     Ubuntu 10.04
  1 2.0 GBytes of RAM                                   Linux kernel 2.6.32
  3 WD 160 GBytes SATA disk (WD1600AAJS-75M0A0 [7])

3.1.2 Impact of CPU and Disks on Inlet/Outlet Temperatures

Outlet temperatures of a node are determined by various factors, including CPU and disk temperatures, motherboard temperatures, and inlet temperatures. The CPU factor has been addressed in prior studies (see, for example, [110] [95] [96]). Unfortunately, the thermal impact of disk I/O on data nodes remains an open issue.

Table 3.2 Experiment Configuration
  Experiment   CPU Utilization (%)   Disk Utilization (%)   Power (W)
  1            0                     0                      73
  2            100                   0                      135
  3            0                     100                    85
  4            100                   100                    142

To investigate the relationship between CPU/disks and the inlet/outlet temperatures, we conduct four experiments, in which combinations of high (100%) and low (0%) utilizations of CPU and disks are considered. The configuration details are shown in Table 3.2. In these experiments, CPU and I/O workloads are generated by stress [5] and Postmark [63], respectively. The power consumption of the testbed is measured by a power meter. The temperatures of the four cores and three disks in the testbed are presented in the rest of this section.

Low CPU and Low Disk Utilization

In the first experiment, we place both the CPU and the disks in the idle mode. Fig. 3.1 shows that the disk and CPU temperatures remain essentially constant. The node's inlet temperature varies slowly from 24.8 ℃ to 30.6 ℃, which leads the outlet temperature to vary accordingly. When the outlet temperature goes up, the inlet temperature also increases due to heat recirculation. On average, the difference between the inlet and outlet temperatures is 3.8654 ℃, ranging between 3.2 ℃ and 5.0 ℃. In this case, the discrepancy between inlet and outlet temperatures can be expressed as a constant. Thus, we have:

    Tdiff1(t) = 3.8654                                     (3.1)

High CPU and Low Disk Utilizations

In the second experiment, we keep the CPU extremely busy (i.e., the CPU utilization approaches 100%) while placing the disks in the idle mode. Fig. 3.2 shows that the CPU temperature goes up fast; it increases by 20 ℃ in 4 minutes. On the other hand, the disk temperatures do not change much. The difference between the inlet and outlet temperatures increases slowly from 4.6 ℃ to 6.6 ℃ in the first 600 seconds, and then stays at a constant value for the next 1200 seconds. We denote the inlet/outlet temperature difference as Tdiff2(t), where t refers to the time for which the data node has been running under 100% CPU and 0% disk utilization.

Figure 3.1: Temperature evaluation under the low CPU and low disk utilizations.
Figure 3.2: Temperature evaluation under the high CPU and low disk utilizations.

Thus, we have:

    Tdiff2(t) = 0.0023 t + 4.8818,  if t ≤ 600
                6.2692,             if t > 600             (3.2)

Low CPU and High Disk Utilizations

In the third experiment, we keep a low CPU utilization while increasing the disk utilization up to approximately 100%. We run three tasks, each of which imposes an I/O-intensive load on the disk. We observe from Fig. 3.3 that the CPU temperature frequently fluctuates between 31 ℃ and 35 ℃, because the three I/O-intensive tasks require CPU resources to issue I/O requests. Nevertheless, the CPU utilization remains fairly low. After completing the tasks, the CPU returns to the idle state and its temperature decreases to the normal value. In this case, the thermal impact of the CPU is negligible. In contrast, the disk temperatures slowly increase at a rate of around 2 ℃ per 1000 seconds. The difference between inlet and outlet temperatures can be expressed by (3.3).

    Tdiff3(t) = 0.0001 t + 4.6086,  if t ≤ 1000
                4.7086,             if t > 1000            (3.3)

High CPU and High Disk Utilization

In the final experiment, we push both CPU and disk utilizations up to 100%. We observe that the CPU temperature increases by 20 ℃ at the beginning and goes back to its original value after 1500 seconds, when the CPU-intensive tasks are completed. Therefore, we focus on the data collected before 1500 seconds. The inlet/outlet temperature difference falls in the range from 4.3 ℃ to 7.5 ℃. In the first 660 seconds, the temperature difference increases very quickly and then does not fluctuate much.

Figure 3.3: Temperature evaluation under the low CPU and high disk utilizations.

Figure 3.4: Temperature evaluation under the high CPU and high disk utilizations.

Thus, we conclude from the experiment that CPU and disks significantly affect outlet temperatures, and the discrepancy between inlet and outlet temperatures can be expressed as (3.4).
    Tdiff4(t) = 0.0014 t + 5.3720,  if t ≤ 660
                6.8923,             if t > 660             (3.4)

Fig. 3.4 also shows that the average cold-start time for the three disks is more than 1200 seconds, much longer than the cold-start time of the CPU (i.e., about 100 seconds).

3.2 Thermal Models

It is extremely challenging to model the energy consumption relationship between computing and cooling systems. The cooling cost depends not only on cooling settings (e.g., inlet temperatures and cooling equipment placement), but also on the heat dissipated by computing facilities. CPU and disks are two major types of components and heat contributors in data nodes. In this section, we develop a thermal model that aims to estimate outlet temperatures by considering the impacts of CPU and disks. Moreover, by combining a coefficient of performance (COP, for short) model that predicts cooling costs from the CRAC supply temperature [83], our model can be used to predict the impact of CPUs and disks on cooling cost.

3.2.1 Framework

Figure 3.5: Framework of the proposed solution.

Fig. 3.5 displays our thermal-modeling framework, which consists of two components, namely, the inlet/outlet-temperature model and the COP model. The inlet/outlet-temperature model builds up the relationship between inlet and outlet temperatures by profiling analysis. In addition, given an outlet temperature, our model estimates inlet temperatures under certain workloads. The COP model computes cooling costs by taking into account the inlet temperatures offered by the inlet/outlet-temperature model. The main contributions of this framework are: (1) a thermal model that characterizes the relationship between inlet and outlet temperatures of a data node and (2) cooling cost estimation for data center designers.

3.2.2 An Inlet/Outlet Temperature Model

Considering CPU and disk utilizations, we classify the workloads of a node into four basic types (see Section 3.1.2 for the combinations of high and low utilizations of CPU and disks). During any time period, the workload of a node can be decomposed into a number of sub-periods, in each of which the node runs under one of the four basic types. Thus, in each sub-period, the discrepancy between inlet and outlet temperatures is modeled by incorporating the four basic workload types.

    Tdiff(t) = Tdiff1(t),  if UCPU = 0,   Udisk = 0
               Tdiff2(t),  if UCPU = 100, Udisk = 0
               Tdiff3(t),  if UCPU = 0,   Udisk = 100
               Tdiff4(t),  if UCPU = 100, Udisk = 100      (3.5)

Given workloads and a set of sub-periods T = {t1, ..., tn}, we derive the overall inlet/outlet temperature difference from (3.1)-(3.4) as:

    Tdiff(T) = ( Σ_{i=1}^{n} Tdiff(ti) ) / |T|             (3.6)

3.2.3 The COP Model

The total energy cost of a node consists of the energy consumed by the node itself and the cooling cost. We use the COP (i.e., the Coefficient Of Performance) model, described in [83], to calculate the cooling cost.

Figure 3.6: Coefficient of performance curve for the chilled-water CRAC units at the HP Labs Utility Data Center [83].

Fig. 3.6 plots COP values, which increase with the supply temperature of the CRAC. A large COP value indicates a high energy efficiency.

    COP(T) = 0.0068 T^2 + 0.0008 T + 0.458                 (3.7)

In (3.7), COP is defined as the ratio of the heat removed to the energy consumed by the cooling system to remove that heat, and T refers to the supply temperature of the CRAC.
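The piecewise model (3.1)-(3.4), the averaging rule (3.6), and the COP polynomial (3.7) are simple enough to evaluate directly. The following sketch is our illustration (the function names and the sub-period representation are our own assumptions, not part of the dissertation):

    def t_diff(cpu_util, disk_util, t):
        """Inlet/outlet temperature difference (Celsius) for one basic workload type."""
        if cpu_util == 0 and disk_util == 0:        # both idle, Eq. (3.1)
            return 3.8654
        if cpu_util == 100 and disk_util == 0:      # CPU busy, Eq. (3.2)
            return 0.0023 * t + 4.8818 if t <= 600 else 6.2692
        if cpu_util == 0 and disk_util == 100:      # disks busy, Eq. (3.3)
            return 0.0001 * t + 4.6086 if t <= 1000 else 4.7086
        if cpu_util == 100 and disk_util == 100:    # both busy, Eq. (3.4)
            return 0.0014 * t + 5.3720 if t <= 660 else 6.8923
        raise ValueError("only the four basic workload types are modeled")

    def average_t_diff(sub_periods):
        """Average inlet/outlet difference over sub-periods, Eq. (3.6).
        Each item is a (cpu_util, disk_util, elapsed_seconds) tuple."""
        return sum(t_diff(c, d, t) for c, d, t in sub_periods) / len(sub_periods)

    def cop(supply_temp):
        """Coefficient of performance of the CRAC, Eq. (3.7)."""
        return 0.0068 * supply_temp ** 2 + 0.0008 * supply_temp + 0.458

    # Example: 10 minutes CPU-bound followed by 10 minutes I/O-bound.
    print(average_t_diff([(100, 0, 600), (0, 100, 600)]))  # 5.4652, as in Pattern 2 below
    print(cop(25.0))                                       # about 4.728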
The cooling power PAC can be derived from the COP using (3.8):

    PAC = PC / COP(T),                                     (3.8)

where PC is the computing power.

3.3 Case Studies

In order to demonstrate the application of our thermal model, we conduct three case studies, representing three typical access patterns of applications. We use the same testbed (see Section 3.1) to perform the case studies, and we keep all three disks busy in the high-disk-utilization cases. Let us consider the following access patterns (see Fig. 3.7) in our case studies:

• Pattern 1: In the Computing After Reading pattern, applications first load data from disks, then process the loaded data using CPU resources.
• Pattern 2: In the Computing Then Writing pattern, applications perform CPU-intensive computation first, followed by write-intensive activities to output data to disks.
• Pattern 3: In the Computing and Reading/Writing in Parallel pattern, applications concurrently impose both CPU-intensive and I/O-intensive load on the node.

Figure 3.7: Three typical access patterns.

Since the cold-start phase of disks is longer than that of the CPU, we consider two scenarios in each case study. The first scenario represents cases where the execution time of I/O tasks is shorter than the cold-start phase; in this scenario, the cold-start issue significantly affects outlet temperatures. The second scenario represents cases where the execution time of I/O tasks is much longer than the cold-start time; in this scenario, the cold-start issue becomes negligible. In the case studies, PC is the node's power consumption.

Impact of the Cold-Start Phase

We set the execution time of both the CPU- and I/O-intensive tasks to 10 minutes, which is shorter than the cold-start phase of the disks. During the period of 10 minutes, the differences between inlet and outlet temperatures under the four basic workload types are:

    Tdiff1(600) = 3.8654 (℃)
    Tdiff2(600) = 6.2618 (℃)
    Tdiff3(600) = 4.6686 (℃)
    Tdiff4(600) = 6.2120 (℃)

After processing the CPU- and I/O-intensive tasks for 20 minutes in each case study, we evaluate the differences between inlet and outlet temperatures as follows.

Access Pattern 1. The disks are kept busy in the first phase; Tdiff3(600) denotes the inlet/outlet-temperature difference. The increase of the difference between inlet and outlet temperatures is Tdiff3(600) - Tdiff1(0), which is 0.8032 ℃. Since the cold-start time of the disks is longer than 10 minutes, the disk temperature remains unchanged in the second phase. In this case, the increase of the inlet/outlet-temperature difference in the first phase is treated as an increase in the inlet temperature for the second phase, so this increment is carried over to the second phase. Therefore, the overall inlet/outlet-temperature difference can be derived as:

    Tpattern1(1200) = [ Tdiff3(600) + (Tdiff3(600) - Tdiff1(0) + Tdiff2(600)) ] / 2 = 5.8668 (℃)

Access Pattern 2. We obtain an average difference between inlet and outlet temperatures of 5.4652 ℃ after running the test for 20 minutes. Tdiff2(600) is the temperature increment in the first phase, in which the CPU is busy.
Then, in the second phase, the CPU temperature falls back to its normal value within the first 10 seconds, so the CPU temperature in the second phase can be considered a constant. The inlet/outlet temperature difference in the second phase can be calculated by Tdiff3(600). The average inlet/outlet temperature difference is:

    Tpattern2(1200) = [ Tdiff2(600) + Tdiff3(600) ] / 2 = 5.4652 (℃)

Access Pattern 3. The inlet/outlet temperature difference increases from 3.8654 ℃ to 6.2120 ℃ in the first phase. In the second phase, the CPU temperature drops quickly, whereas the disk temperature decreases slowly. Since the increasing and decreasing rates of the disk temperature are slow, no difference is observed within a 10-minute period. Hence, we use Tdiff4 and Tdiff1 to calculate Tpattern3(1200) as:

    Tpattern3(1200) = [ Tdiff4(600) + Tdiff1(600) ] / 2 = 5.0387 (℃)

Theoretically, the cooling costs under these three patterns are reflected by the inlet/outlet-temperature difference. To precisely evaluate cooling costs, we use the COP model, which takes inlet temperatures as input and produces the cooling energy consumption. The inlet temperatures in the case studies are calculated such that identical outlet temperatures are produced after the CPU- and I/O-intensive tasks are executed. For example, the inlet temperatures under the aforementioned access patterns are 24.1 ℃, 24.5 ℃ and 25.0 ℃ with the outlet temperature being 30 ℃. According to the COP model, the COP values of these access patterns are:

    COPpattern1 = COP(24.1) = 4.4268
    COPpattern2 = COP(24.5) = 4.5593
    COPpattern3 = COP(25.0) = 4.728

Given the power of the node (see Table 3.2), we derive the energy dissipation as:

    PPOWER1 = 135 × 600 + 85 × 600 = 132,000 (J)
    PPOWER2 = 135 × 600 + 85 × 600 = 132,000 (J)
    PPOWER3 = 142 × 600 + 73 × 600 = 129,000 (J)

The cooling costs calculated by the COP model are:

    PAC1 = PPOWER1 / COPpattern1 = 29,818 (J)
    PAC2 = PPOWER2 / COPpattern2 = 28,952 (J)
    PAC3 = PPOWER3 / COPpattern3 = 27,284 (J)

From the above analysis, access pattern 3 saves the cooling cost of patterns 1 and 2 by 2,534 J and 1,668 J, respectively. The total energy costs, including computing and cooling energy consumption, are shown below:

    PTOTAL1 = PPOWER1 + PAC1 = 161,818 (J)
    PTOTAL2 = PPOWER2 + PAC2 = 160,952 (J)
    PTOTAL3 = PPOWER3 + PAC3 = 156,284 (J)

We observe that access pattern 3 leads to the lowest energy consumption. Pattern 3 makes it possible to increase the CRAC temperature to lower the cooling cost. This observation motivates us to propose a thermal-aware workload management scheme that minimizes the total energy consumption by data placement optimization (see Chapter 6).

To validate the accuracy of the model, we manually measure the inlet and outlet temperatures of the node using an infrared thermometer. We collect 20 temperature samples in each case study and compare the inlet/outlet-temperature differences obtained from our model against the real-world measurements. Table 3.3 shows that the precision errors of our model for the three case studies are 2.28%, 3.74%, and 4.84%, respectively. The precision error is calculated by dividing the average difference between the real measurements and the model results by the real measurements.

Table 3.3 Thermal Model Validation
                        Case Study 1   Case Study 2   Case Study 3
  Precision Error (%)   2.28           3.74           4.84
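The arithmetic in these case studies is easy to reproduce. As a hedged check (our own sketch; the variable names are illustrative), the pattern-1 numbers follow directly from Table 3.2, the COP polynomial, and Eq. (3.8) applied to energy rather than power:

    def cop(supply_temp):
        return 0.0068 * supply_temp ** 2 + 0.0008 * supply_temp + 0.458

    phase_seconds = 600                      # each phase lasts 10 minutes
    p_disk_busy, p_cpu_busy = 85, 135        # watts, experiments 3 and 2 in Table 3.2

    computing_energy = p_disk_busy * phase_seconds + p_cpu_busy * phase_seconds
    cooling_energy = computing_energy / cop(24.1)   # inlet temperature for pattern 1
    total_energy = computing_energy + cooling_energy

    print(computing_energy)        # 132000 J
    print(round(cooling_energy))   # about 29818 J
    print(round(total_energy))     # about 161818 J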
Negligible Cold-Start Phase

The cold-start phase is insignificant if the execution time of the CPU- and I/O-intensive tasks is sufficiently long; in that case, its impact becomes negligible. Now, we extend the model to consider cases where the cold-start phase can be ignored. We set the execution time of the tasks to 60 minutes (120 minutes in total); the Tdiff values of the basic workload types are given below:

    Tdiff1(3600) = 3.8654 (℃)
    Tdiff2(3600) = 6.2692 (℃)
    Tdiff3(3600) = 4.7086 (℃)
    Tdiff4(3600) = 6.8923 (℃)

The average inlet/outlet-temperature differences under the three access patterns are:

    Tpattern1(7200) = 5.4889 (℃)
    Tpattern2(7200) = 5.4889 (℃)
    Tpattern3(7200) = 5.3789 (℃)

We can obtain the total energy costs of these cases as:

    PTOTAL1 = 1,610,000 (J)
    PTOTAL2 = 1,610,000 (J)
    PTOTAL3 = 1,570,000 (J)

The results show that, compared with patterns 1 and 2, pattern 3 offers 40,000 J of energy savings.

3.4 Summary

Energy efficiency and thermal management of storage systems must be urgently addressed, because the energy consumption and cooling costs of large-scale storage systems in data centers have been increasing in the past decade. Recent studies show that cooling costs contribute a significant portion of the operational cost of data centers. Thermal management techniques have been applied to reduce the energy consumption of cooling systems, thereby significantly improving the energy efficiency of data centers. Thermal models play a key role in thermal management; however, traditional thermal models for data centers do not take into account disk utilizations. In this chapter, we developed a thermal model to investigate the thermal impacts of hard disks on storage systems. We showed how to apply the thermal model to estimate the outlet temperature of a storage server based on processor and disk utilizations.

The proposed thermal model offers the following two benefits. First, the model makes it possible to reduce thermal monitoring cost. Thermal management of hard disks in storage systems helps to cut cooling cost and boost system reliability. Monitoring temperatures is a key issue in thermal management techniques; however, it is prohibitively expensive to acquire and set up a huge number of sensors in a large-scale data center. Our model is an alternative to monitoring temperatures of storage systems. Second, our thermal model enables data center designers to make intelligent decisions on thermal management during the design phase.

Chapter 4 Advanced Thermal Models

In the previous chapter, we have learned that disks have an impact on outlet temperature comparable to that of processors. In the preliminary experiments, the inlet and outlet temperatures of the data node were detected by using an infrared thermometer. By monitoring the disk temperature with its internal temperature sensor, we observe that disk temperatures increase by only 1-2 ℃ when the disks are fully utilized. If the disk is not heavily loaded, no significant difference appears compared with the disk staying in the idle state.

In this chapter, to achieve higher accuracy of the temperature models, we deploy external temperature sensors [6] to monitor temperatures and collect the temperature data with MiniGoose [4]. We conduct several groups of experiments to study the thermal behaviour of disks and the CPU under various utilizations. Furthermore, we also investigate their impacts on the temperature of the data node.

This chapter is organized as follows. Section 4.1 states the testbed for all the experiments presented in this chapter. Section 4.2 shows approaches for modeling the temperatures of a hard drive disk and a solid state disk. Section 4.3 introduces a thermal modeling approach for the CPU.
Section 4.4 demonstrates an outlet temperature model that takes into account inlet temperature and workloads. Section 4.5 evaluates the thermal models by comparing the estimated values with real measurements. Finally, Section 4.6 concludes the chapter and summarizes its major contributions.

4.1 Testbed

The testbed used in this chapter is equipped with a Celeron(R) 2.2 GHz CPU, 1.0 GBytes of RAM, and a 500 GBytes SATA disk. External temperature sensors and MiniGoose are applied to monitor the disk, inlet, and outlet temperatures. The configuration parameters are summarized in Table 4.1.

Table 4.1 Testbed Configuration for Advanced Thermal Models
  Hardware                                               Software
  1 Intel(R) Celeron(R) 450@2.2GHz                       Ubuntu 10.04
  1 1.0 GBytes of RAM                                    Linux kernel 2.6.32
  1 WD 500 GBytes SATA disk (WD5000AAKS-75M0A0 [8])

4.2 Thermal Models of Disks

To study the thermal characteristics of HDDs (hard drive disks) and SSDs (solid state disks), we investigate the thermal behaviours of a Western Digital hard drive disk (WD5000AAKS [8]) and an Intel SSD (SSDSA2M080G2GC [2]). The specifications of these two disks are shown in Table 4.2. The Intel SSD has a faster sequential read rate than the Western Digital HDD, but a slower sequential write rate. In addition, the Intel SSD consumes much less energy than the Western Digital HDD in both the idle and active states.

Table 4.2 Specifications of the Two Disks
                             WD5000AAKS   Intel SSD
  Capacity (GB)              500          80
  Sequential Read (MB/s)     126          250
  Sequential Write (MB/s)    126          70
  Power (Idle)               8.75 W       75 mW
  Power (Active)             9.5 W        150 mW

Throughout the rest of this section, the following four features are measured to study disk thermal characteristics in the context of cluster storage systems.

1. Steady Temperature: The temperature of a disk that stays in a steady state.
2. Temperature Increment: The difference between an initial temperature and a steady temperature when a disk is active.
3. Heat-up Time: The time interval during which a disk heats up from its initial temperature to a steady temperature when the disk is active.
4. Cool-down Time: The time interval during which a disk cools down from a steady temperature to the disk's initial temperature.

4.2.1 Ambient Impacts on Disk Temperatures

Evidence shows that ambient temperatures have impacts on processor temperatures [103]; however, little attention has been given to the impact of ambient temperatures on disk temperatures. As the first step toward the coarse-grained thermal model, we conduct a group of experiments to study the thermal impacts of ambient temperatures on disks.

Fig. 4.1 shows disk temperatures during an idle period when the computer room temperature is set to 22.2 ℃, 22.8 ℃, 23.2 ℃, and 23.8 ℃. We observe that the ambient temperature does affect the temperature of disks that are sitting idle. As shown in Fig. 4.1(a), when the ambient temperature is 22.2 ℃, the disk temperature of the Western Digital hard drive disk is 26.49 ℃. An ambient temperature of 23.8 ℃ makes the disk temperature increase to 28.87 ℃. An increase of 1.6 ℃ in ambient temperature leads to an increment of 1.97 ℃ in disk temperature. For the Intel SSD, as shown in Fig. 4.1(b), its temperatures are 24.86 ℃, 25.0 ℃, 25.75 ℃, and 26.06 ℃, respectively. It is worth noting that, in the idle state, the temperature of the Intel SSD is lower than that of the Western Digital HDD under the various ambient temperatures.
This result suggests that the ambient temperature has a direct impact on disk temperatures.

Figure 4.1: Disk temperatures are affected by ambient temperatures.

4.2.2 Various Number of Transactions

We control disk utilization by varying the number (i.e., 1000, 2000, and 5000) of I/O transactions issued by Postmark. We set the computer room temperature to 23.2 ℃ and use Postmark to launch three I/O-intensive tasks. Each task starts running when the disk is sitting idle in a steady state; the initial disk temperature is 27.62 ℃ for the Western Digital HDD and 25.75 ℃ for the Intel SSD. Table 4.3 shows the features of the three tasks. The number of files is set to 100, and file sizes range from 1.E+6 to 1.E+8 Bytes. All the other parameters of Postmark are set to their default values.

Table 4.3 Configurations of Tasks with Various Number of Transactions
                      Task 1          Task 2          Task 3
  File Number         100             100             100
  Transactions        1,000           2,000           5,000
  File Size (Byte)    1.E+6 - 1.E+8   1.E+6 - 1.E+8   1.E+6 - 1.E+8

The execution times of running these three tasks on the two disks are shown in Table 4.4. We observe that, when the tasks are running, the utilizations of both disks are 100%. By comparing the execution times, we conclude that the Intel SSD performs better than the Western Digital HDD.

Table 4.4 Execution Time of Running Three Different Tasks
  Disk Type              Execution Time (s)
                         Task 1   Task 2   Task 3
  Western Digital HDD    905      2115     5649
  Intel SSD              803      1504     3733

The temperatures of the two disks are shown in Fig. 4.2. The Western Digital HDD's temperatures when running these tasks are shown in Fig. 4.2(a). When assigning 5000 transactions to the hard drive disk, its peak temperature is 28.75 ℃; when running 2000 transactions, its peak temperature is 28.61 ℃. We observe that the disk temperature goes up gradually, and it takes about 30 minutes for the disk to heat up to the peak temperature or to cool down from the peak temperature to its initial temperature. The difference between the initial temperature and the peak temperature is around 1.13 ℃.

The experimental results of running these three tasks on the Intel SSD are shown in Fig. 4.2(b). The steady temperature of the Intel SSD in the idle state is around 25.75 ℃. When it is fully utilized, its temperature goes up very fast. When running 1000 or 2000 transactions on the Intel SSD, the peak temperature is not the same as when running 5000 transactions. When running 5000 transactions, the Intel SSD's temperature is heated up to 28.75 ℃, an increment of 3.0 ℃ over its initial steady temperature. Thus, when analysing the thermal characteristics of the Intel SSD, we prefer to consider the experimental results of running 5000 transactions, which ensures that the disk has been heated up to its steady temperature in the busy state.

Figure 4.2: Disk temperature of running different tasks.
The Intel SSD's heat-up stage is 20 minutes, and its cool-down stage is slightly shorter than 20 minutes. Both the heat-up and the cool-down stage of the Intel SSD are shorter than those of the Western Digital HDD. A comparison of the temperatures of these two disks running 5000 transactions in the chassis is shown in Fig. 4.3. We observe that the inlet temperatures in both experiments keep fluctuating between 23 ℃ and 24 ℃. The temperature of the Western Digital hard drive disk increases by less than 1.5 ℃. For the Intel SSD, however, we find a significant increase of 3.0 ℃ in the disk temperature. On average, the Intel SSD's steady temperature when fully utilized is higher than that of the Western Digital HDD. In addition, the Intel SSD's execution time is considerably shorter than that of the Western Digital HDD.

We summarize the execution time and the heat-up and cool-down times of these two disks (see Fig. 4.4) to make a better comparison. We observe that the Western Digital HDD takes about 42% more time than the Intel SSD to finish the task, owing to the SSD's significantly faster read rate. The HDD also needs more time to heat up or cool down than the Intel SSD. Fig. 4.5 shows the comparison of the temperature data for these two disks. The HDD's initial temperature is about 1.87 ℃ higher than that of the Intel SSD; however, its peak temperature and its steady temperature in the active state are lower than those of the Intel SSD. From all of the above, we conclude that the Intel SSD is more sensitive to disk activity, and it heats up and cools down faster than the Western Digital HDD.

Let us consider the heat-up stage (the first 30 minutes) of running Task 2 on the Western Digital HDD. To formalize the disk thermal profile, we fit two models to the data in the heat-up stage. First, we use a polynomial model to fit the disk temperature Tdisk as a function of time t, Tdisk(t) = ω t^2 + β t + γ. Then we fit a logarithmic model to represent the disk temperature, Tdisk(t) = α ln(t) + β. The detailed parameters of these two models are shown in Table 4.5.

Figure 4.3: Thermal characteristics of running 5000 transactions.

Figure 4.4: Time comparison of disks running 5000 transactions.

Figure 4.5: Temperature comparison of disks running 5000 transactions.

Table 4.5 Parameters for Fitting Polynomial and Logarithmic Models to Disk Temperature as a Function of Time
  Disk Utilization (%)   Polynomial Fit (ω, β, γ)   Logarithmic Fit (α, β)
  100                    -0.002, 0.09, 27.62        0.3506, 27.51

To validate the accuracy of these two models, we compare the temperatures obtained from these models with those measured on the real-world disk. As shown in Fig. 4.6, for the heat-up stage, the estimated values offered by the polynomial model are very close to the real measurements, with a precision error of 0.15% and a standard deviation of 0.12%. The logarithmic model yields a precision error of 0.25% and a standard deviation of 0.19%.
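The fits themselves are ordinary least-squares regressions. A minimal sketch with NumPy is shown below; it is our illustration, and the sample points are placeholders rather than the measured heat-up series, so the coefficients it prints will only roughly resemble those in Table 4.5.

    import numpy as np

    # Placeholder heat-up samples: time in minutes and disk temperature in Celsius.
    t = np.array([1.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0])
    temp = np.array([27.6, 27.9, 28.1, 28.3, 28.4, 28.5, 28.6])

    # Polynomial model Tdisk(t) = w*t^2 + b*t + g.
    w, b, g = np.polyfit(t, temp, 2)

    # Logarithmic model Tdisk(t) = a*ln(t) + c, fitted as a linear model in ln(t).
    a, c = np.polyfit(np.log(t), temp, 1)

    # Precision error: mean absolute relative deviation from the measurements,
    # mirroring how precision errors are reported in this chapter.
    poly_err = np.mean(np.abs(np.polyval([w, b, g], t) - temp) / temp) * 100
    log_err = np.mean(np.abs(a * np.log(t) + c - temp) / temp) * 100
    print(f"polynomial: w={w:.4f}, b={b:.4f}, g={g:.4f}, err={poly_err:.2f}%")
    print(f"logarithmic: a={a:.4f}, c={c:.4f}, err={log_err:.2f}%")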
Figure 4.6: Comparison of estimated disk temperatures with real measurements.

For the steady stage (the disk running at full utilization while its temperature stays unchanged), we use a constant value of 28.7 ℃ to represent the disk temperature. For the cool-down stage (the disk state changes from active to idle), we use the same process to model the disk temperature, and we observe that in the cool-down stage the polynomial model (Tdisk(t) = 0.0003 t^2 - 0.0101 t + 28.61) has a much better precision error than the logarithmic model (Tdisk(t) = -0.2430 ln(t) + 28.862). In these two models, t represents the time in minutes that the disk stays in the cool-down stage. For the Intel SSD, the same approach can be applied to model its temperature in the heat-up and cool-down stages.

4.2.3 Disk Temperatures under Various Utilizations

We first conduct five experiments on the Western Digital HDD to study the thermal impacts of disk utilizations on disk temperatures. In each experiment, we assign one task to the disk and let the task start while the disk is sitting idle in a steady state. The ambient temperature is 23.2 ℃ and the initial temperature of the disk is 27.62 ℃. The number of files and the file sizes are the same as shown in Table 4.3.

We alter disk utilization by varying the write block size and the buffering setting of Postmark. If buffering is enabled, buffered stdio function calls are used instead of the lower-level raw system calls [63]. All the other parameters of Postmark are set to their default values. The disk utilization is periodically assessed by the iostat utility program. The experiment settings are summarized in Table 4.6.

Table 4.6 Postmark Configurations of Experiments on Disks
  Scenarios                1    2    3    4     5
  Buffer Enabled           N    N    N    N     Y
  Write Block Size (KB)    16   32   64   128   256

The utilizations and temperatures of the Western Digital HDD in these five experiments are shown in Fig. 4.7 and Fig. 4.8. Fig. 4.7 exhibits that increasing the write block size leads to higher average disk utilization and shorter execution time. As shown in Fig. 4.8, the disk temperatures go through three stages (heat-up, steady, and cool-down) and a large write block size results in a higher disk temperature discrepancy. The average disk utilizations in these five experiments are summarized in Table 4.7. The results indicate that we are able to generate different disk utilizations by choosing an appropriate write block size with Postmark.

Figure 4.7: Western Digital HDD's utilizations with various write block sizes.

Figure 4.8: Western Digital HDD's temperatures with various write block sizes.

Table 4.7 Western Digital HDD's Utilizations under Various Write Block Sizes
  Scenarios           1       2       3       4       5
  Average Util (%)    14.24   28.91   53.49   80.57   100

Figure 4.9: Western Digital HDD's temperature model validation (write block size: 128 Byte).

We use polynomial models and logarithmic models to fit the disk temperatures during the heat-up stage under different disk utilizations.
A comparison of the disk temperatures in the heat-up stage under the utilization of 80.57%, as estimated by these two models and measured in the real experiment, is shown in Fig. 4.9. The precision errors are 0.61% for the polynomial model and 0.21% for the logarithmic model. Here, the logarithmic model fits the disk temperature better than the polynomial model.

Our findings show that the polynomial and logarithmic models can successfully capture the disk temperatures during the heat-up stage under various disk utilizations. Table 4.8 summarizes the important parameters determined in our parameterization process. The average precision error is 0.47% for polynomial fitting and 0.20% for logarithmic fitting. Hence, we conclude that the logarithmic model exhibits better curve-fitting performance than the polynomial model for estimating disk temperatures.

Table 4.8 Parameters for Fitting Polynomial and Logarithmic Models to Western Digital HDD's Temperature under Various Utilizations
  Utilization   Polynomial Fit (ω, β, γ)   err(%)   Logarithmic Fit (α, β)   err(%)
  14%           -0.0018, 0.0486, 27.63     1.15     0.2130, 27.50            0.18
  29%           -0.0007, 0.0392, 27.56     0.17     0.1983, 27.56            0.19
  53%           -0.0001, 0.0257, 27.68     0.27     0.1918, 27.73            0.17
  80%           -0.0018, 0.0958, 27.44     0.61     0.2382, 27.53            0.21
  100%          -0.0020, 0.0900, 27.62     0.15     0.3506, 27.51            0.25

Then we run five experiments on the Intel SSD with the same task configurations as in Table 4.6. Write block sizes for this group of experiments are also set to 16, 32, 64, 128, and 256 Bytes, respectively. Without buffering, different write block sizes lead to different disk utilizations. A comparison of the disk utilizations of the Western Digital HDD and the Intel SSD is shown in Fig. 4.10. The average disk utilizations of the experiments running on the Western Digital HDD are 14.24%, 28.91%, 53.49%, and 80.57% when the write block size is set to 16 Byte, 32 Byte, 64 Byte, and 128 Byte, respectively; for the Intel SSD, the corresponding disk utilizations are 11.00%, 30.57%, 52.90%, and 78.20%. The disk utilizations of the two disks are very close when they are set to the same write block size without buffering, and we observe that a higher write block size leads to a higher average disk utilization for both disks.

For the Western Digital HDD, the execution time is 8481 seconds when the write block size is 16 Bytes; when the write block size is 32 Bytes, the execution time is 4760 seconds; when the write block size is set to 64 Bytes, the running time of the task is 2973 seconds; and setting the write block size to 128 Bytes results in an execution time of 2313 seconds. We conclude that a larger write block size (and thus a higher disk utilization) results in a shorter execution time. The same holds for the Intel SSD: a larger write block size results in a shorter execution time.

For these five experiments with different write block sizes, the initial temperature (i.e., the steady temperature in the idle state) of the Western Digital HDD is about 28 ℃, while for the Intel SSD the initial temperature is 25.75 ℃. Under different disk utilizations, the steady peak temperatures of the disks differ. The peak disk temperatures of these experiments are summarized in Fig. 4.11. From this figure, we observe that a large write block size results in a high peak disk temperature.

Figure 4.10: Disk utilizations under different write block sizes.
Figure 4.11: Peak disk temperatures under different write block sizes.

Both the HDD and SSD scenarios share a similar trend in the sense that a large write block size leads to high disk utilization, which in turn gives rise to a high disk temperature. Thus, we conclude that disk utilization has a positive impact on disk temperature.

4.2.4 Different Number of Disks

One disk may have a marginal impact on the outlet temperature of a data node; however, multiple disks have a profound impact on the thermal behavior of the data node. Our goal is to investigate how multiple disks affect outlet temperatures. The testbed used in this set of experiments includes an Intel(R) Xeon 2.4 GHz CPU and 2.0 GBytes of RAM. We vary the number of disks in a data node from one to four. We test an I/O-intensive task that issues 2000 transactions on each disk; the write buffer is enabled to make the disks maintain high utilization.

Figure 4.12: Inlet/outlet temperature differences in the cases of different numbers of disks.

Fig. 4.12 illustrates that when the disks are sitting idle, the initial differences between inlet and outlet temperature are 2.4 ℃, 2.8 ℃, 2.9 ℃, and 3.4 ℃ for one, two, three, and four disks, respectively. Compared with the one-disk case, the four-disk case has a larger inlet/outlet difference. On average, each disk contributes an increment of about 0.33 ℃ to the outlet temperature, which accounts for more than 10% of the difference between inlet and outlet temperatures. If more than 16 disks are deployed in a data node, such a discrepancy between inlet and outlet temperatures will be even more pronounced. The peak values of the inlet/outlet temperature differences are 2.6 ℃, 3.1 ℃, 3.3 ℃, and 3.7 ℃ for the one-disk, two-disk, three-disk, and four-disk cases, respectively. We conclude that increasing the number of disks in a data node widens the gap between the inlet and outlet temperatures of the data node.

4.3 Thermal Model of CPU

We use the internal temperature sensor of the CPU to monitor the CPU temperature. To study the thermal behaviour of the CPU, we first let the CPU remain in the idle state, where the CPU temperature is around 40 ℃. Then we run Whetstone, a floating-point computation benchmark, to generate different experiment scenarios. In these scenarios, we make small modifications to the original Whetstone benchmark to achieve different CPU utilizations by setting various numbers of loops (i.e., 4000, 8000, 10000, 11900, 11950, and 12000). The CPU utilizations of these experiments are shown in Fig. 4.13, and the CPU temperatures are shown in Fig. 4.14.

As shown in Fig. 4.13, when different numbers of loops are set, the CPU utilizations remain relatively steady around specific values during the whole CPU active phase. Fig. 4.14 shows that the CPU temperatures can also be categorised into three stages: a heat-up stage, a steady stage, and a cool-down stage. In the heat-up stage, the CPU temperature goes up very quickly. In the steady stage, the CPU temperature remains the same with the CPU running at a stable utilization.
In the cool-down stage, the CPU temperature falls back to its original value, which is equal to the CPU temperature in the idle state, because the CPU-intensive workload has finished.

Figure 4.13: CPU utilization under different scenarios.

Figure 4.14: CPU temperature under different scenarios.

From the above two figures, we conclude that increasing the loop number leads to higher CPU utilization and CPU temperature. In all these experiments, when the CPU is active, the CPU temperature goes up very quickly in the first 600 seconds (10 minutes) and then remains steady. We conclude that the heat-up time of the CPU is around 10 minutes, and the cool-down time of the CPU is less than 10 minutes; the CPU cools down to its original temperature faster than it heats up.

Table 4.9 CPU Utilizations and Temperatures in the Steady Stage under Various Number of Loops
  Scenarios                   1      2      3       4       5       6
  Loop Number                 4000   8000   10000   11900   11950   12000
  Average Utilization (%)     13.8   26.7   33.1    65.2    77.9    90.5
  Average Temperature (℃)     41.3   42.9   43.7    46.7    48      49.1
  Max Temperature (℃)         48     50     49      49      51      50
  Min Temperature (℃)         40     40     42      44      46      47

For better comparison, we summarize the average CPU utilizations and temperatures in Table 4.9. The average CPU temperatures in the steady stage are 41.3 ℃, 42.9 ℃, 43.7 ℃, 46.7 ℃, 48.0 ℃, and 49.1 ℃, respectively. The maximum CPU temperatures in the steady stage are from 49 to 51 ℃, and the minimum CPU temperatures in the steady stage increase as the loop number increases.

We use a polynomial model TCPU(t) = ω t^2 + β t + γ and a logarithmic model TCPU(t) = α ln(t) + β to capture the characteristics of CPU temperatures during the heat-up stage under a wide range of CPU utilizations. In these two models, t is the time in seconds during which the CPU is running under a specific utilization. A comparison of the CPU temperatures estimated by these two models and the real measurements when the loop number is 12000 is shown in Fig. 4.15. The precision error of the polynomial model is 1.87%, which is higher than that of the logarithmic model (1.31%). The logarithmic model fits the curve of the CPU temperature better than the polynomial model when the CPU is in the heat-up stage. For the other CPU utilizations, we also observe that the logarithmic models achieve better curve-fitting performance than the polynomial models in most cases. Thus, the logarithmic models are selected for estimating CPU temperatures.

Figure 4.15: CPU temperature model validation (12000 loops).

4.4 Thermal Model of a Data Node

To study the impact of the CPU and the disk on the outlet temperature, we analyse the experimental results of running the modified Whetstone benchmark with 4000 iterations. We use a linear model Tdiff = a + b x + c y to describe the discrepancy between inlet and outlet temperatures. In this model, Tdiff is the difference between the inlet and outlet temperatures, x is the CPU temperature, and y is the disk temperature. The parameters are shown in Table 4.10. Through Toutlet = Tinlet + Tdiff, the outlet temperature of a data storage node can be estimated.
Fig. 4.16 shows the comparisons between the estimated temperatures and the measured ones after running the Whetstone benchmark. The validation results confirm that the model can be successfully applied to estimate outlet temperatures derived from CPU and disk temperatures. The precision error of this model is as low as 0.5%. We also validate the results of running the Whetstone benchmark with other numbers of iterations; the precision errors of the other experiments are all below 0.5%.

Table 4.10 Parameters for the Linear Model to Estimate Outlet Temperatures as a Function of CPU and Disk Temperatures
               a       b        c
  Linear Fit   4.842   0.0773   -0.2232

Figure 4.16: Validation of the outlet temperature model.

4.5 Evaluation of Temperature Models

To verify the CPU, disk, and outlet temperature models, we conduct an experiment by running the WordCount benchmark on a given folder. This folder is composed of files randomly generated by Postmark and is located on the Western Digital HDD; these files sum up to 10 GB. As shown in Fig. 4.17, the CPU and disk utilizations are relatively steady while the benchmark is running. The average CPU utilization is 92.48%, and the average disk utilization is 18.60%.

Figure 4.17: CPU and disk utilizations for running WordCount.

Now we demonstrate a way of applying our proposed models to estimate disk temperature. Three main steps are involved in deriving estimated disk temperatures for a specific disk utilization:

1. choose several time stamps and record disk temperatures under different disk utilizations;
2. build a disk temperature model as a function of the disk's initial temperature, the ambient temperature, and the disk utilization; and
3. estimate the disk temperature under a specific utilization using the model built in the previous step.

The above procedure allows us to estimate disk temperatures using ambient temperatures and disk utilizations; a sketch of one possible realization is given after the derivation below.

The following shows the details of the disk-temperature estimation procedure. We obtain the disk temperatures from our preliminary experiments when the disk utilization is 14.24%, 28.91%, 53.49%, 80.57%, and 100%, respectively. Then, six time stamps (i.e., 5, 10, 15, 20, 25, and 30 minutes) are chosen, and the disk-temperature equations are applied to estimate the disk temperature at each time stamp. With the estimated temperature data, we apply the logarithmic model to fit the disk temperature at these six time stamps when the disk utilization becomes 18.60%. The results are summarized in Table 4.11.

Table 4.11 Estimated Disk Temperatures under a Specific Utilization
  Time (min)   Real Measurement                              Estimation
               14.24%   28.92%   53.49%   80.57%   100%      (18.60%)
  5            27.83    27.88    28.04    27.92    28.20     27.85
  10           27.99    28.02    28.17    28.07    28.39     28.00
  15           28.07    28.09    28.25    28.17    28.50     28.07
  20           28.14    28.15    28.30    28.24    28.58     28.13
  25           28.18    28.19    28.35    28.29    28.64     28.17
  30           28.22    28.23    28.39    28.34    28.69     28.21

Using the estimated disk temperatures under the disk utilization of 18.60%, we derive the disk-temperature model as follows:

    Tdisk(t) = 0.2 ln(t) + 27.53,                          (4.1)

The above model can be used to predict disk temperatures during the heat-up stage when the disk utilization is 18.60%.
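The dissertation does not give code for this procedure, and it does not state exactly how the estimates at 18.60% are interpolated from the profiled utilizations. The sketch below is therefore only one plausible realization (simple linear interpolation across utilizations, followed by a logarithmic refit); the temperature samples are the values of Table 4.11.

    import numpy as np

    times = np.array([5.0, 10.0, 15.0, 20.0, 25.0, 30.0])        # minutes
    profiled_utils = np.array([14.24, 28.92, 53.49, 80.57, 100.0])
    temps = np.array([                      # rows: time stamps; columns: utilizations
        [27.83, 27.88, 28.04, 27.92, 28.20],
        [27.99, 28.02, 28.17, 28.07, 28.39],
        [28.07, 28.09, 28.25, 28.17, 28.50],
        [28.14, 28.15, 28.30, 28.24, 28.58],
        [28.18, 28.19, 28.35, 28.29, 28.64],
        [28.22, 28.23, 28.39, 28.34, 28.69],
    ])

    target_util = 18.60
    # Steps 1-2: interpolate the disk temperature across utilizations at each time stamp.
    est = np.array([np.interp(target_util, profiled_utils, row) for row in temps])

    # Step 3: refit the logarithmic model Tdisk(t) = a*ln(t) + b to the estimates;
    # the coefficients come out close to Eq. (4.1), Tdisk(t) = 0.2*ln(t) + 27.53.
    a, b = np.polyfit(np.log(times), est, 1)
    print(a, b)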
Following the same process, we can generate the model for estimating the CPU temperature under the utilization of 92.48% as follows:

    TCPU(t) = 1.27 ln(t) + 42.01,                          (4.2)

The comparisons of the estimated CPU and disk temperatures with the real measurements are shown in Fig. 4.18 and Fig. 4.19.

Figure 4.18: CPU temperature model validation for WordCount.

Figure 4.19: Disk temperature model validation for WordCount.

The precision errors of the CPU and disk models are 1.52% and 0.48%, respectively. Then, we apply the same outlet temperature model to this experiment and find that the outlet temperature estimate has a precision error of 3.77%.

4.6 Summary

The goal of our study is to build a thermal model to estimate the outlet temperature of a storage server (a.k.a. data node) based on processor and disk utilizations. Thermal models play a key role in thermal management; however, traditional thermal models for data centers do not take into account disk utilizations. In this chapter, we developed a thermal model to investigate the thermal impacts of hard disks on data nodes in storage clusters. Our thermal models were developed at a coarse-grained level without knowledge of the detailed specifications of data nodes. Our experimental results show that our modeling approach can predict the temperatures of both the disk and the CPU with high accuracy. Furthermore, we presented how to apply the thermal model to estimate the outlet temperature of a storage server under given processor and disk utilizations. In this chapter, we make the following contributions:

1. We generated the thermal profile of a storage server. The profiling results are obtained by running I/O-intensive workloads imposed by Postmark and CPU-intensive workloads imposed by Whetstone. While the disk and CPU are running under various load scenarios, we monitor their temperatures as well as the inlet and outlet temperatures of the data node with temperature sensors.
2. We built a thermal modeling approach for estimating the temperatures of the CPU and the disk under given workloads.
3. We built an outlet temperature model by considering the thermal impacts of the inlet temperature and the CPU and disk temperatures.

Our method enables data storage systems to save thermal monitoring costs. In addition, our thermal models enable data center designers to make intelligent decisions on thermal management during the design phase. Thermal management of storage systems helps to cut cooling costs and boost system reliability. Monitoring temperatures is a key issue in thermal management techniques; however, it is prohibitively expensive to acquire and set up a huge number of sensors in a large-scale data center. Our modeling method is an alternative to monitoring temperatures of storage systems.

Although most of the experiments in this chapter were conducted under the ambient temperature of 23.2 ℃, our proposed approach can be applied to data storage environments with various CRAC supply temperatures. When the CRAC supply temperature changes, we may need to repeat the profiling experiments, which allow us to assign specific parameter values to our model.

Chapter 5 Thermal-aware Task Scheduling

Now we propose a thermal-aware, energy-efficient task scheduling system, in which task schedulers are introduced for dispatching incoming workloads.
As verified in Chapter 3, scheduling computing and read/write tasks in parallel saves more energy than the other two access patterns. Thus, in our task scheduling system, we keep CPUs and disks as busy as possible. The system consists of two components: a centralized thermal-aware task scheduler that maintains a global task waiting list and a candidate node list containing the data nodes that are not fully utilized; and sub-schedulers that are installed on every data node to maintain the tasks assigned to them. The centralized task scheduler is responsible for dispatching workloads according to the properties of the tasks. In the process of task scheduling, thermal issues are considered to avoid hot spots in data centers.

This chapter is organized as follows. First, Section 5.1 introduces the framework of our thermal-aware task scheduling system. Then, the performance and efficiency of our task scheduling system are presented in Section 5.2. Finally, Section 5.3 summarizes the contributions of our thermal-aware task scheduling system.

5.1 Framework

The framework of our management system for task scheduling is shown in Fig. 5.1. It shows a data storage system with n nodes; on top of the storage system, a thermal-aware task scheduling system manages the workload assigned to the storage system. On each data node, we deploy a sub-system, in which a monitor is applied to detect the utilization and temperature of the components in this data node. Our system schedules the workload so that the components (i.e., CPU, disk, etc.) work as hard as possible while the outlet temperature does not exceed a threshold.

Figure 5.1: The framework of the thermal management system for task scheduling.

The thermal-aware task scheduling system maintains two lists:

• a global waiting task list, and
• a candidate node list.

The global waiting task list holds the tasks waiting to be dispatched. The candidate node list maintains the data nodes that are in the idle state or relatively lightly loaded. In the global task list, the tasks are arranged in ascending order of arrival time. The scheduling system monitors the behaviour of all data nodes and assigns the tasks in the global task list to the candidate data nodes. Before assigning a task to a candidate data node, the runtime information is fetched from the monitor of that data node, and the temperature models are applied to estimate the thermal impact of the task on the candidate data node. The task is assigned to this candidate data node only if it would not introduce a hot spot.

The sub-scheduling system also maintains three task lists on each data node:

• a waiting list,
• a ready list, and
• a running list.

The waiting list holds the tasks that must be executed on this particular data node, the ready list holds the tasks that will be executed immediately, and the running list holds the tasks that are currently running on the data node. The sub-scheduling system manages these three task lists. It maintains the runtime information of the running tasks and launches tasks from the ready list on each node. When the ready list is empty, it admits tasks from the waiting list, provided that these tasks would not cause the outlet temperature to exceed the threshold.
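To make the admission logic concrete, the sketch below shows how a sub-scheduler might gate waiting tasks on the predicted outlet temperature. It is our illustration only: the class layout, the monitor interface, and the per-task utilization attributes are assumptions, and the outlet estimate simply reuses the linear model of Section 4.4 (Table 4.10).

    from collections import deque

    OUTLET_THRESHOLD = 27.0      # Celsius, the per-node threshold used in Section 5.2

    class SubScheduler:
        """Per-node scheduler keeping waiting, ready, and running task lists."""

        def __init__(self, monitor):
            # The monitor is assumed to report the inlet temperature and to estimate
            # CPU/disk temperatures for a hypothetical additional workload.
            self.monitor = monitor
            self.waiting = deque()
            self.ready = deque()
            self.running = []

        def predicted_outlet(self, extra_cpu_util, extra_disk_util):
            """Predict the outlet temperature if a task adding the given utilizations
            were admitted, via Toutlet = Tinlet + a + b*Tcpu + c*Tdisk (Table 4.10)."""
            t_cpu, t_disk = self.monitor.estimate_temps(extra_cpu_util, extra_disk_util)
            a, b, c = 4.842, 0.0773, -0.2232
            return self.monitor.inlet_temp() + a + b * t_cpu + c * t_disk

        def admit_from_waiting(self):
            """Called when the ready list is empty: admit waiting tasks as long as the
            predicted outlet temperature stays below the threshold."""
            while self.waiting:
                task = self.waiting[0]
                if self.predicted_outlet(task.cpu_util, task.disk_util) >= OUTLET_THRESHOLD:
                    break
                self.ready.append(self.waiting.popleft())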
Whenever a new task arrives, the scheduling system first checks whether the task is node-related. Here, a node-related task refers to a task that must be executed on a particular data node, which holds the data needed to complete the task. If the task is not node-related, it is pushed into the global waiting task list, and the system then dispatches the tasks in the global waiting task list to the candidate data nodes. When a task in the global waiting list is dispatched to a data node, it is moved to the ready list of that data node.

If the task is node-related, the system checks the monitor information of the destination data node and estimates the CPU and disk utilizations that the new task will introduce. With the thermal models introduced in the previous chapters, the impact on the outlet temperature can be estimated. If the outlet temperature will not exceed the threshold, the task is put into the ready list of the data node and executed immediately. However, if the outlet temperature is estimated to exceed the threshold, the new task is added to the waiting list of the data node; once the system finds that the task will no longer drive the outlet temperature over the threshold, the task is moved to the ready list.

If the waiting list of a data node is so long that its tasks cannot be finished within an expected time period, the system chooses some candidate data nodes and moves some of the tasks in the waiting list of the current data node to the candidate data nodes. When determining which tasks in the waiting list to move, the following rules are applied:

1. choose CPU-intensive tasks first, and then I/O-intensive tasks;
2. choose tasks whose associated data can also be accessed on the candidate data nodes;
3. choose tasks whose associated data is smaller in size than that of other tasks.

After determining which task to move and where it should be moved, the scheduling system checks whether the destination data node has the required data. If the destination data node has the required data, the task can be moved directly to the waiting list of the destination node. If it does not, a new task is generated to move the data from the original data node to the destination data node. After the data movement, the task is moved to the destination data node's waiting list.

5.2 Experiments

To evaluate the performance of our task scheduling system, we conduct two groups of experiments, which resemble various real-world workload scenarios. Table 4.1 shows the parameters of a small-scale storage cluster of four data nodes. Throughout these experiments, we set the outlet temperature threshold of each data node to 27 ℃.

For tasks without any preferred data node, our task scheduler is free to dispatch them to any candidate node. When selecting the best candidate data node to assign tasks to, the scheduler should address the following issue: it may assign tasks to the least loaded data nodes or to the data nodes with the highest utilization. For comparison purposes, we consider the following three scheduling policies:

• Distribute Evenly (DE): to evenly schedule tasks to all the data nodes in first-in-first-out order, thereby balancing load well among the nodes.
• Distribute based on Utilization (DU): to schedule tasks to as many data nodes as needed while keeping the active nodes' utilization at a high level.
• Distribute to Minimum Active Nodes (DMN): schedule tasks in a way that minimizes the number of active data nodes.

5.2.1 CPU-intensive Workload

In the first group of experiments, we consider a CPU-intensive workload. A total of ten CPU-intensive tasks run Whetstone on the cluster; these tasks lead to various CPU utilizations. The configuration and average utilization of each task are summarized in Table 5.1.

Table 5.1 Task Configurations of CPU-intensive Workloads
Tasks          Task 1   Task 2   Task 3   Task 4   Task 5
LOOPS (#)       4000     8000    10000    11820    11850
Avg Util (%)      13       25       32       44       52
Tasks          Task 6   Task 7   Task 8   Task 9   Task 10
LOOPS (#)      11900    11930    11980    12020    12050
Avg Util (%)      64       72       85       96      100

Let us consider a baseline task scheduler that assigns all tasks to a single data node, thereby using the least number of active data nodes. We conduct experiments that assign all ten tasks to one of the four available data nodes, where the tasks are executed sequentially. The average time to complete the ten tasks scheduled by this baseline approach is 6,131 seconds.

Table 5.2 lists the three task scheduling strategies under the CPU-intensive workload conditions. The DE strategy evenly assigns tasks to the four data nodes. For instance, on data node 1, tasks 1 and 5 are executed concurrently, and task 9 runs after tasks 1 and 5 complete. When the DU strategy is in charge of the scheduling, tasks 1 and 8 are executed simultaneously on node 1, where the CPU utilization is as high as 98%. After completing tasks 1 and 8, node 1 starts running task 9. With the DU strategy in place, each node keeps a high CPU utilization while ensuring that its CPU is not overloaded. When it comes to the DMN strategy, new tasks are scheduled to minimize the number of active data nodes; thus, tasks 1, 2, and 3 are all assigned to data node 1.

Table 5.2 Task Scheduling Schemes for CPU-intensive Workloads
Strategies   Node 1            Node 2            Node 3        Node 4
DE           Tasks 1, 5, 9     Tasks 2, 6, 10    Tasks 3, 7    Tasks 4, 8
DU           Tasks 1, 8, 9     Tasks 2, 7, 10    Tasks 3, 6    Tasks 4, 5
DMN          Tasks 1, 2, 3, 8  Tasks 4, 5, 9     Tasks 6, 10   Task 7

Figure 5.2: Execution time and active time of data nodes under CPU-intensive workloads.

Fig. 5.2 reveals the performance of the three scheduling strategies. Execution time refers to the time spent completing all submitted tasks; active time is defined as the accumulation of the time intervals in which the four data nodes stay in the active state. Experimental results show that the outlet temperatures of the data nodes do not exceed the specified threshold.

Figure 5.3: Energy consumption of data nodes under CPU-intensive workloads.

Fig. 5.3 compares the baseline scheme with the three evaluated strategies in terms of energy consumption. Among all four scheduling strategies, the baseline exhibits the longest execution time and consumes the most energy. Comparing the three evaluated strategies, the DMN strategy achieves the best performance, whereas DE delivers the highest energy efficiency. For example, DE consumes 3.8% less energy than the other strategies and 28.9% less energy than the baseline scheme.
Thus, we conclude that the DE strategy is the best scheduler for CPU-intensive loads on storage clusters.

5.2.2 I/O-intensive Workload

In the second group of experiments, we assign ten I/O-intensive tasks to the cluster. Each task generates 50 files and issues 200 transactions. We change the write block size to vary the disk utilization of each data node. The characteristics of these I/O-intensive tasks are shown in Table 5.3.

Table 5.3 Task Configurations of I/O-intensive Workloads
Tasks                     Task 1, 2   Task 3, 4   Task 5, 6   Task 7, 8   Task 9, 10
Write Block Size (Byte)   16          32          64          128         256
Avg Util (%)              14          29          54          81          100

A baseline scheme assigns all the tasks to a single data node. We compare the aforementioned scheduling strategies with this baseline. Table 5.4 shows the three task scheduling schemes.

Table 5.4 Task Scheduling Schemes for I/O-intensive Workloads
Strategies   Node 1            Node 2            Node 3        Node 4
DE           Tasks 1, 5, 9     Tasks 2, 6, 10    Tasks 3, 7    Tasks 4, 8
DU           Tasks 1, 8, 9     Tasks 2, 7, 10    Tasks 3, 6    Tasks 4, 5
DMN          Tasks 1, 2, 3, 4  Tasks 5, 6        Tasks 7, 9    Tasks 8, 10

Fig. 5.4 shows the performance of the evaluated scheduling strategies. The results reveal that, regardless of the tested scheduler, the outlet temperatures are kept below the pre-defined threshold. The energy consumption of the cluster managed by the three strategies, compared with the baseline, can be found in Fig. 5.5. Not surprisingly, the baseline strategy is outperformed by the three other schedulers in terms of execution time and energy consumption. The utilization-based scheduler is superior to the other three schemes in performance. The most energy-efficient scheduler is the one (i.e., DE) that evenly distributes the load across all four data nodes; this scheduler reduces energy consumption by 10.8% compared with the baseline and by 3.4% compared with the other schemes. Again, DE is the best scheduler for the I/O-intensive workload. In summary, under both CPU-intensive and I/O-intensive workload conditions, evenly distributing the load across active data nodes is very energy efficient.

Figure 5.4: Execution time and active time of data nodes under I/O-intensive workloads.

Figure 5.5: Energy consumption of data nodes under I/O-intensive workloads.

5.3 Summary

Energy-aware task scheduling policies were proposed to redistribute workloads in order to minimize the energy consumption of computing infrastructures. We incorporated our thermal models into a thermal-aware management system that distributes tasks so as to keep data nodes thermal- and energy-friendly. This system is integrated into a task scheduler that dispatches and redistributes tasks in such a way that all the data nodes' outlet temperatures remain below a given threshold. Three strategies were considered in the process of determining which candidate data node should be selected. Through experiments dispatching CPU-intensive and I/O-intensive workloads, we concluded that evenly distributing the workload across active data nodes is more energy-efficient than the other two strategies.

Chapter 6
Thermal-aware Data Placement

Our evidence shows that disks have non-negligible thermal impacts on the temperature of data nodes (see Chapter 3 and Chapter 4).
In this chapter, we demonstrate that data placement strategies can significantly affect the thermal performance of data nodes. First, we study the thermal impacts of data placement strategies in a homogeneous environment in Section 6.1. Then, in Section 6.2, we consider the thermal impacts of data placement in a hybrid data storage system. Finally, Section 6.3 concludes this chapter.

6.1 Homogeneous Disk Arrays

After developing a thermal model for a single disk, we are in a position to investigate the thermal behaviors of multiple disks. Nowadays, a single data node usually hosts multiple disks; for instance, a single Teradata appliance can support more than 100 disks. To study how multiple disks deployed in a single data node affect each other and how they affect the outlet temperature of a data node, we conduct two groups of experiments, in which the number of disks is set to two and three, respectively. In this study, we use the internal disk sensors to monitor the disk temperatures, because external temperature sensors cannot be attached to the disks in a disk array.

6.1.1 The Two-Disk Case

In the first group of experiments, two disks are configured in the data node. In this data placement study, we use the same testbed described in Chapter 3. It is noteworthy that both disks are placed inside the node's chassis rather than in an external disk array. The two disks are of the same type. Compared with disk 2, disk 1 sits closer to the fan. The initial temperature of disk 1 is 36 ℃, and the initial temperature of disk 2 is 38 ℃. Two I/O-intensive tasks driven by Postmark are running on the two disks. We leverage Postmark to create 100 files, the sizes of which range anywhere between 1 and 100 MBytes. Each of the two tasks issues a total of 2,000 transactions.

Table 6.1 The Two-Disk Scenarios
             Disk 1        Disk 2
Scenario 1   Task 1        Task 2
Scenario 2   Tasks 1 & 2   —
Scenario 3   —             Tasks 1 & 2

We set up the three scenarios summarized in Table 6.1. In scenario 1, the two tasks keep both disks busy. In scenarios 2 and 3, the two tasks access one disk while keeping the other disk idle.

Figure 6.1: Thermal impacts of data placement in the two-disk case (disk temperatures over time for the three scenarios).

Fig. 6.1 shows the disk temperatures in the three tested scenarios. In scenario 1, the temperature of disk 1 increases by 4 ℃ and that of disk 2 by 3 ℃. In scenario 2, after running for a few minutes, the temperature of disk 1 increases by 3 ℃ and that of disk 2 by 1 ℃. In scenario 3, the temperature of disk 2 increases by 4 ℃ and that of disk 1 by 2 ℃.

Table 6.2 Peak Average Disk Temperatures and Total Task/Application Execution Times
Scenarios    Peak Average Temperature (℃)   Task Time (s)   Application Time (s)
Scenario 1   40.5                            4,136           2,250
Scenario 2   39.0                           10,632           5,323
Scenario 3   40.0                            7,948           3,981

Table 6.2 compares the execution times and the peak average temperatures of the two disks tested in the three scenarios. Task execution time is the sum of the two tasks' execution times; application execution time is the maximum execution time of the two tasks involved in the application.
We observe that scenario 3 results in the shortest accumulated active disk time (i.e., 3,981 seconds) compared with scenario 1 (i.e., 4,136 seconds) and scenario 2 (i.e., 5,323 seconds), suggesting that the disks tested in scenario 3 may consume the least energy. Evenly distributing the requests issued by the application to the two disks (see scenario 1) produces a high average disk temperature; however, scenario 1 exhibits a smaller application execution time than scenarios 2 and 3. More interestingly, issuing requests to disk 1, which is closer to the fan in the chassis (see scenario 2), gives rise to the lowest average disk temperature. This result reveals that scenario 2 is more thermal friendly than the other two scenarios.

6.1.2 The Three-Disk Case

We deploy three disks inside a disk-array chassis connected to the HP server. The testbed is shown in Table 6.3. The disk-array chassis has a fan to cool down the disks. We use Postmark to initially create 100 files, the sizes of which range from 1 to 100 MBytes. Three Postmark tasks issue 1,000 requests to the disks. Ten scenarios (see Table 6.4) are investigated in this group of experiments. In the first scenario, the three tasks access the three disks separately. In the next three scenarios, the three tasks share a single disk. In the remaining scenarios, different task assignments are examined.

Table 6.3 Testbed Configuration for the Three-Disk Case
Hardware                                               Software
4 × Intel(R) Xeon X3430 2.4 GHz CPU                    Ubuntu 10.04
1 × 2.0 GBytes of RAM                                  Linux kernel 2.6.32
3 × WD 160 GBytes SATA disk (WD1600AAJS-75M0A0 [7])    lm-sensors [3], hddtemp [1]

Table 6.4 The Three-Disk Scenarios
              Disk 1            Disk 2            Disk 3
Scenario 1    Task 1            Task 2            Task 3
Scenario 2    Tasks 1 & 2 & 3   —                 —
Scenario 3    —                 Tasks 1 & 2 & 3   —
Scenario 4    —                 —                 Tasks 1 & 2 & 3
Scenario 5    Tasks 1 & 2       Task 3            —
Scenario 6    Tasks 1 & 2       —                 Task 3
Scenario 7    Task 1            Tasks 2 & 3       —
Scenario 8    Task 1            —                 Tasks 2 & 3
Scenario 9    —                 Tasks 1 & 2       Task 3
Scenario 10   —                 Task 3            Tasks 1 & 2

Figs. 6.2 and 6.3 plot the disk utilization and temperature of the first four scenarios examined in the three-disk case; Figs. 6.4 and 6.5 show the other task-assignment scenarios. The peak average temperatures of the three disks, the task/application execution times, and the estimated cooling cost of each scenario are summarized in Table 6.5, where task execution time is the sum of the three tasks' execution times and application execution time is the maximum execution time of the three tasks within the application.

We observe that evenly distributing tasks to the disks (i.e., scenario 1) leads to higher temperatures on average than forcing all the tasks to share a single disk; however, it takes 1,500 seconds (the shortest time) to complete all the I/O requests. Fig. 6.2(a) shows that the temperatures of disks 1 and 2 increase by 2 ℃, while the temperature of disk 3 increases by 1 ℃.

Figure 6.2: Thermal impacts of data placement in the three-disk case ((a) scenario 1; (b) scenario 2).

When the three tasks share one disk, the temperature of that disk increases by 2 ℃, whereas the temperatures of the other two disks remain unchanged.
We conclude that sharing a disk among multiple tasks can maintain low disk temperatures at the cost of increased I/O processing time (e.g., from 1,500 to 3,000 seconds).

In both scenarios 5 and 6, two tasks issue I/O requests to disk 1 and the third task sends I/O requests to another disk. The task execution times in these two scenarios are 2,616 and 4,271 seconds, respectively. The long execution time of scenario 6 keeps the three disks at a higher temperature than in the initial state. Fig. 6.4(a) shows that the temperature of disk 1 increases by 3 ℃ and the temperature of disk 2 increases by 1 ℃; Fig. 6.4(b) indicates that the temperatures of disks 1 and 3 both increase by only 1 ℃.

Figure 6.3: Thermal impacts of data placement in the three-disk case (2) ((a) scenario 3; (b) scenario 4).

In scenarios 7 and 9, two tasks are assigned to disk 2 and the third task is allocated to another disk. The task execution times in these two scenarios are close to each other. Figs. 6.4(c) and 6.5(b) show that the temperature of disk 2 increases by 3 ℃. The temperature of disk 1 in scenario 7 rises by only 1 ℃, whereas the temperature of disk 3 in scenario 9 goes up by 2 ℃. The disks consume more energy in scenario 7 than in scenario 9.

Figure 6.4: Thermal impacts of data placement in the three-disk case (3) ((a) scenario 5; (b) scenario 6; (c) scenario 7).

Figure 6.5: Thermal impacts of data placement in the three-disk case (4) ((a) scenario 8; (b) scenario 9; (c) scenario 10).

Table 6.5 Peak Average Disk Temperatures, Execution Times, and Estimated Cooling Costs
Scenarios     Peak Average Temp (℃)   Task Time (s)   Application Time (s)   Cooling Cost (J)
Scenario 1    36.35                    4,144           1,500                   23,655
Scenario 2    35.33                    3,010           3,010                   48,527
Scenario 3    35.00                    3,024           3,024                   48,671
Scenario 4    35.00                    3,126           3,126                   50,065
Scenario 5    35.34                    2,616           1,768                   28,469
Scenario 6    34.67                    4,271           2,551                   41,169
Scenario 7    35.34                    3,032           2,134                   34,340
Scenario 8    35.00                    4,466           2,751                   44,370
Scenario 9    35.33                    2,717           1,846                   29,723
Scenario 10   35.35                    3,227           2,063                   33,244

When it comes to scenarios 8 and 10, disk 3 handles requests from two tasks and another disk deals with the requests from the third task. The task execution time of scenario 8 is much longer than that of scenario 10. Let us consider the first 4,000 seconds of the testing process: Figs. 6.5(a) and 6.5(c) illustrate that the average temperature of the three disks in scenario 10 is higher than that in scenario 8. These results confirm that assigning tasks to the disk sitting in the middle can give rise to high disk temperatures and low energy efficiency.

From Table 6.5, we observe that the cooling cost of scenario 1 is the lowest and the cooling cost of scenario 4 is the highest. From the above experiments, we conclude that although evenly distributing the tasks leads to the highest peak average temperature, this load-balancing strategy keeps the disks at high temperatures for a shorter time, offers better overall performance, and is more energy-efficient.

6.1.3 Data Placement Strategy

The previous subsection shows evidence that the outlet temperatures affected by disks vary greatly among the tested cases. In the three-disk case, we chose to evaluate ten scenarios out of many other possibilities. For example, one possible scenario might be a workload composed of tasks with different disk utilizations or different execution times. Moreover, to provide large storage capacity, one may increase the number of disks in each data node. Manually measuring all possible scenarios is a time-consuming and impractical process. A promising solution is to use real measurements collected in simple disk configurations and to model the thermal characteristics of other, more complicated scenarios.

Our results suggest that disk temperatures significantly affect the outlet temperature of a node, and disk temperatures in turn depend on data placement and I/O activities. These observations motivate us to study thermal-aware data placement strategies, which aim to migrate data among disks in order to minimize cooling costs. Let us consider a storage cluster containing a large number of data nodes. Encouraged by the experimental results presented in the previous sections, we propose a thermal-aware data placement strategy that is composed of two stages (a small sketch follows at the end of this subsection):
• Initial stage: place data sets in data nodes in such a way that all the nodes have very similar outlet temperatures.
• Redistribution stage: migrate data according to the temperature distribution measured by sensors and predicted by our models.

In the initial stage, a large amount of data must be written into the data nodes of a storage cluster. A straightforward strategy is to evenly distribute the data across all the data nodes in the system. Data nodes of a storage cluster can be configured in two ways. The first strategy is designed for storage clusters where all nodes have the same number of disks deployed; in this strategy, more data are placed on the disks whose idle-state temperatures are higher than those of the other disks. The second strategy is tailored for heterogeneous storage clusters where nodes have different numbers of disks; in this case, data nodes equipped with more disks should handle a smaller amount of data in order to reduce heat stress.

After the initial stage of a storage cluster, the data access patterns are likely to change dynamically. For example, some data sets are accessed more frequently than others, so the storage cluster tends to exhibit unbalanced outlet temperatures among its data nodes. To balance thermal stress, the data placement mechanism migrates hot data sets from nodes with high outlet temperatures to those with low outlet temperatures. The data redistribution process is triggered by a threshold on outlet temperatures; for instance, when the maximum outlet temperature is 25% higher than the average temperature of all the nodes, the data redistribution process begins. To maintain high I/O performance, our mechanism delays the redistribution process while the nodes involved in the migration procedure are under very heavy I/O load.
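As a small illustration of the redistribution stage described above, the Python sketch below checks the 25% trigger and picks a migration pair, deferring the move while either node is busy. The node dictionaries, the BUSY_IO_UTIL cutoff, and the example temperatures are assumptions made for the sake of the example; only the 25% trigger and the hot-to-cold migration rule come from the text above.

```python
from statistics import mean

TRIGGER_RATIO = 1.25   # redistribute when the max outlet temperature exceeds
                       # the average of all nodes by 25% (Section 6.1.3)
BUSY_IO_UTIL = 0.8     # assumed cutoff for "heavy I/O load"; not specified above

def needs_redistribution(outlet_temps):
    """True when the hottest node is sufficiently above the cluster average."""
    return max(outlet_temps) > TRIGGER_RATIO * mean(outlet_temps)

def pick_migration_pair(nodes):
    """The hottest node donates a hot data set to the coolest node, but the move
    is deferred while either side is under heavy I/O load."""
    hottest = max(nodes, key=lambda n: n["outlet_temp"])
    coolest = min(nodes, key=lambda n: n["outlet_temp"])
    if hottest["io_util"] > BUSY_IO_UTIL or coolest["io_util"] > BUSY_IO_UTIL:
        return None   # protect I/O performance; try again later
    return hottest, coolest

# Example with four nodes (temperatures in Celsius, utilizations in 0..1).
nodes = [
    {"name": "node1", "outlet_temp": 35.0, "io_util": 0.2},
    {"name": "node2", "outlet_temp": 26.0, "io_util": 0.1},
    {"name": "node3", "outlet_temp": 25.0, "io_util": 0.3},
    {"name": "node4", "outlet_temp": 24.0, "io_util": 0.1},
]
if needs_redistribution([n["outlet_temp"] for n in nodes]):
    pair = pick_migration_pair(nodes)
    if pair is not None:
        src, dst = pair
        print(f"migrate hot data sets from {src['name']} to {dst['name']}")
```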
6.2 Hybrid Storage Clusters

After studying the thermal impacts of data placement strategies on homogeneous storage systems, we are in a position to investigate the thermal behaviors of hybrid disks in the context of cluster storage systems, each of which is comprised of a number of storage nodes. Thanks to the good I/O performance offered by SSDs, future cluster storage systems are likely to be powered by a large number of hybrid disk configurations containing both HDDs and SSDs. In this section, we pay attention to the thermal behaviors of two types of hybrid storage clusters. We show that data placement is an efficient approach to minimizing the negative thermal impacts of a hybrid storage cluster in high-performance clusters.

6.2.1 System Configuration of Hybrid Storage

In this part of the study, we build two types of hybrid cluster storage systems, namely inter-node and intra-node hybrid cluster storage systems (see Fig. 6.6). In an inter-node hybrid cluster storage system, there are two types of storage nodes: SSD-enabled nodes and HDD-enabled nodes. All disks in an SSD-enabled node are solid state disks, whereas all disks in an HDD-enabled node are hard drives. In an intra-node hybrid cluster storage system, each node contains both solid state disks and hard drives. Intra-node hybrid cluster storage systems are homogeneous in the sense that all the nodes share an identical configuration. In contrast, inter-node hybrid systems are heterogeneous because some nodes are equipped with SSDs while others are comprised of HDDs.

Figure 6.6: Two types of hybrid cluster storage systems ((a) inter-node: HDD-only nodes alongside SSD-only nodes; (b) intra-node: each node contains both HDDs and SSDs).

6.2.2 Case Studies

We investigate HDD-first and SSD-first data placement strategies, in which data are distributed to either HDDs or SSDs. Under the HDD-first strategy, one of the HDDs is randomly selected if both HDDs and SSDs are available, whereas the SSD-first strategy chooses SSDs first. In our evaluation, the inter-node hybrid storage cluster is comprised of 128 SSD-enabled nodes and 128 HDD-enabled nodes; the intra-node hybrid storage cluster has 256 nodes. We make use of Postmark to resemble 128 I/O-intensive tasks, each of which creates 1,000 files and issues 5,000 I/O requests. We set the outlet temperatures of the nodes to 40 ℃.

Inter-Node Hybrid Storage Cluster

In an inter-node hybrid storage cluster (see Fig. 6.6(a)), the I/O tasks are evenly issued to the HDD-enabled nodes by the HDD-first strategy. In this case, the requests can be completed within 88 minutes based on our preliminary experiments.
According to the HDD temperature model, the temperature of the working HDD increases to 28.40 ℃, while the temperature of the other HDD in the node remains at 27.50 ℃. The temperatures of both SSDs residing in the SSD-enabled nodes remain unchanged (i.e., 25.75 ℃). We define the average of the two disk temperatures as the disk temperature of a storage node. The discrepancy between the inlet and outlet temperatures of the HDD-enabled nodes is T_diff(27.95 ℃) = 2.10 ℃; the discrepancy between the inlet and outlet temperatures of the SSD-enabled nodes is T_diff(25.75 ℃) = 1.43 ℃. Therefore, if the inlet temperatures of the HDD-enabled and SSD-enabled nodes are 37.90 ℃ and 38.57 ℃, respectively, we obtain the same outlet temperature of 40 ℃. Since our preliminary experiments show that there is about an 8 ℃ difference between the inlet temperature and the air-conditioner supply temperature, the air-conditioner supply temperatures should be set to 29.90 ℃ for HDD-enabled nodes and 30.57 ℃ for SSD-enabled nodes in order to reach the same outlet temperature of 40 ℃.

The power consumptions of an HDD-enabled node and an SSD-enabled node are 66.25 W and 48.9 W in the idle state. The COP model (see Fig. 3.6 in Section 3.7) indicates that the COP values for the HDD-enabled and SSD-enabled nodes are 6.56 and 6.84, respectively. Let us consider the power consumption of this inter-node cluster. The mechanical energy consumptions are 353,760 J for an HDD-enabled node and 258,129 J for an SSD-enabled node. Using the COP values, we estimate that the cooling costs of the HDD-enabled and SSD-enabled nodes are 53,917 J and 37,362 J, respectively. Therefore, the total energy consumption incurred by the inter-node hybrid storage cluster and its cooling system is 90,064,864 J.

By using the SSD-first strategy, the I/O requests are evenly handled by the SSD-enabled nodes. In this case, the requests can be finished within 62 minutes based on the preliminary results. The temperature of the active SSD is 28.75 ℃, whereas the other SSD and the HDDs remain at 25.75 ℃ and 27.50 ℃, respectively. At the HDD-enabled nodes, the difference between the inlet and outlet temperatures is 1.96 ℃; this temperature difference at the SSD-enabled nodes is 1.88 ℃. Thus, the inlet temperatures of the HDD-enabled and SSD-enabled nodes are nearly 38.04 ℃ and 38.12 ℃, and the supply temperatures are 30.04 ℃ for HDD-enabled nodes and 30.12 ℃ for SSD-enabled nodes. Using the same method, we calculate that the total energy consumption in this case is 63,139,305 J. The SSD-first strategy could thus save 42.64% of the energy consumption compared with the HDD-first strategy in the inter-node hybrid storage cluster.

Intra-Node Hybrid Storage Cluster

In an intra-node hybrid storage cluster, the I/O requests are processed by the HDDs in 128 nodes under the HDD-first strategy, while the other 128 nodes remain idle. If the SSD-first strategy is applied, the only difference from the HDD-first case is that the I/O requests are executed on SSDs rather than HDDs. Due to space limitations, we do not present the intermediate results, which can be calculated in a similar way. The total energy consumption is 90,022,885 J under the HDD-first strategy and 63,137,638 J under the SSD-first strategy; the SSD-first strategy reduces the energy consumption by 42.58%.

We observe that the total energy consumption of the HDD-first strategy on the inter-node hybrid cluster is the highest, and using the SSD-first strategy on the intra-node hybrid cluster results in the lowest total energy consumption.
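To make the energy accounting above easier to follow, the short Python sketch below assembles the per-node computing energy and the COP-based cooling cost into a cluster total for the HDD-first, inter-node case. The helper functions are our own; the 67.0 W figure is simply the average power implied by the reported 353,760 J over 88 minutes, 48.9 W is the reported idle power of an SSD-enabled node, and the printed total only approximates the reported 90,064,864 J because of rounding in the intermediate values.

```python
def node_energy(avg_power_w, duration_s, cop):
    """Computing energy plus cooling energy for one node (Joules). Cooling is
    modeled as the computing energy divided by the air conditioner's COP at the
    chosen supply temperature (see the COP model in Chapter 3)."""
    computing = avg_power_w * duration_s
    cooling = computing / cop
    return computing + cooling

def cluster_energy(groups):
    """Sum energy over groups of identical nodes.
    Each group is (node_count, avg_power_w, duration_s, cop)."""
    return sum(n * node_energy(p, t, cop) for n, p, t, cop in groups)

# HDD-first case on the inter-node cluster: 128 HDD-enabled nodes serve I/O for
# 88 minutes while 128 SSD-enabled nodes stay idle.
hdd_first = [
    (128, 67.0, 88 * 60, 6.56),   # HDD-enabled nodes (≈ 353,760 J each)
    (128, 48.9, 88 * 60, 6.84),   # SSD-enabled nodes (idle)
]
print(f"estimated total energy: {cluster_energy(hdd_first):,.0f} J")
```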
Within the same hybrid architecture, the SSD-first strategy saves more energy than the HDD-first strategy. We conclude that keeping the SSDs active in the intra-node hybrid storage cluster achieves the best energy efficiency.

6.3 Summary

In this chapter, we first studied the impact of data placement on the cooling cost and thermal performance of storage systems and proposed a thermal-aware, energy-efficient data placement strategy. Then, we built two types of hybrid storage clusters, namely inter-node and intra-node hybrid storage clusters. Compared with the HDD-first strategy, the SSD-first strategy is an efficient approach to minimizing the negative thermal impacts of hybrid storage clusters for cluster computing.

Chapter 7
Predictive Thermal-aware Energy-efficient Data Transmission

Growing data transmission has become a crucial type of workload in data centers. This chapter presents a novel Predictive Thermal-Aware Management System (PTMS) that is able to reduce the energy cost of storage systems by appropriately selecting data transmission methods. We evaluate the energy consumption of three methods (1. transfer data without archiving and compression; 2. archive and transfer data; 3. compress and transfer data) in preliminary experiments. According to the results, we observe that the energy consumption of data transmission varies greatly from case to case, so we cannot simply apply one method in all cases. Therefore, we design an energy prediction model to estimate the total energy cost of data transmission under a particular transmission method. Based on the model, our predictive energy-aware management system can automatically select the most energy-efficient method for data transmission.

This chapter is organized as follows: Section 7.1 introduces background information about data transmission in data centers; Section 7.2 shows the preliminary results of applying three data transmission strategies to two different types of datasets; Section 7.3 presents the framework of our predictive thermal-aware management system; Section 7.4 presents the efficiency of our PTMS in transferring two large datasets; finally, Section 7.5 concludes this chapter.

7.1 Introduction

A data transmission between two data nodes is composed of three phases: the pre-transmission phase, the transmission phase, and the after-transmission phase. In the first phase, data are read from disk into the cache on the original data node. In the second phase, data are transferred from the original data node to the destination data node through the network. In the last phase, data are written to the destination disk.

Frequent data transmission contributes a large portion of the energy consumption of data centers. Data placement strategies and data reuse methods have been proposed to reduce the energy cost of potential data movements. A new trend in decreasing the energy consumption of data transmission is to compress the data before transferring it. In a preliminary work, the thermal behaviour of data compression has been investigated [60]. Our predictive thermal management system aims at decreasing the thermal impact of these data transmissions and reducing the total energy cost of data nodes in data centers. We study three transmission strategies for transferring various data resources:
• Direct Transmission
• Archived Transmission
• Compressed Transmission
In Direct Transmission (DT for short), data are transferred over the network directly, without archiving or compression; in the transmission phase, the original data are transferred.
In Archived Transmission (AT for short), data are first archived as a single file in the pre-transmission phase and then transferred through the network. After the archived data reach the destination data node, the data are un-archived on the destination data node and then written to disk in the after-transmission phase.

In Compressed Transmission (CT for short), data are first compressed into a single file in the pre-transmission phase, then transferred through the network, and finally decompressed and written to disk at the destination data node in the after-transmission phase.

Data compression has been claimed to be an efficient solution for saving energy in high-end servers and data centers [68]. Compared with direct data transmission, compressed data transmission sends a smaller volume of data through the network. However, compressed data transmission generates extra workload on the CPU, which drives the CPU to work at a relatively high utilization.

In the data centers of current business companies (e.g., Google, Amazon, Facebook), there are more download transmissions than upload transmissions. The download process consists of transferring data from data nodes in data centers to customer clients. We therefore focus on reducing the energy cost of data transmission from the data center's perspective.

7.2 Preliminary Results

To characterize the overall energy cost of data transmissions over network interconnections, we start this study by investigating the performance and thermal behaviours of various data transmission strategies. In this section, we first describe the testbed and the three data transmission methods used in our preliminary experiments. Next, we conduct the experiments on two real datasets and illustrate the thermal impacts made by these three strategies. Finally, we demonstrate the motivation for our predictive energy-aware management of storage systems.

The testbed consists of two Linux servers connected through fast Ethernet. Table 7.1 summarizes the configuration details of the servers acting as nodes of a storage cluster. In the experiments, CPU and disk temperatures are collected from embedded device sensors; the inlet and outlet temperatures of the storage nodes are monitored by four sensors attached to the nodes.

Table 7.1 Testbed Configuration for Data Transmission
                    Node 1                                         Node 2
CPU                 Intel(R) Celeron(R) 450 @ 2.2 GHz              Intel(R) Celeron(R) 450 @ 2.2 GHz
Network             1 Gigabit Ethernet network card                1 Gigabit Ethernet network card
Disk                WD 500 GB SATA disk [8]                        WD 160 GB SATA disk [7]
Operating System    Ubuntu 10.04 (lucid), Linux kernel 2.6.32-43   Ubuntu 10.04 (lucid), Linux kernel 2.6.32-38

We transfer two real-world datasets between the two storage nodes; the results are presented below. Three data-transmission strategies (DT, AT, and CT) are examined in this preliminary experiment.

Transferring a Single Text File

In the first group of experiments, we apply the above three strategies to transfer a single text file of 507.7 MB from node 1 to node 2.

Figure 7.1: Performance of transferring one text file in direct transmission ((a) node 1; (b) node 2; CPU/disk utilization and temperature over time).

Figs. 7.1, 7.2, and 7.3 display the temperature and utilization of the CPUs and disks during the data transmission of the large text file.
We observe that the execution times of DT and AT are very close; however, CT is an outlier, roughly doubling the execution time of both DT and AT. Regardless of the method, CPU temperatures increase significantly, whereas disk temperatures stay unchanged. Constant disk temperatures are reasonable because disks have relatively long heat-up periods (i.e., about 30 minutes) [59]; staying in the active state for a short period (e.g., less than one minute) has no significant impact on the disk temperature.

Figure 7.2: Performance of transferring one text file in archived transmission ((a) node 1; (b) node 2).

Figs. 7.1(a), 7.2(a), and 7.3(a) show that node 1's CPU utilization and temperature increase rapidly, whereas its disk utilization remains at a low level. The CT scheme gives rise to extremely high CPU utilization because the compression process is very computation intensive. On the other hand, CT's disk utilization is only about half of that of the other two methods. DT and AT have similar CPU and disk utilizations.

Figure 7.3: Performance of transferring one text file in compressed transmission ((a) node 1; (b) node 2).

Figs. 7.1(b), 7.2(b), and 7.3(b) reveal that node 2's CPU utilization is close to that of node 1 under the DT and AT cases, whereas node 2's CPU utilization is only about one fifth of that of node 1 in the CT case. Thus, the CPU temperature of node 2 under CT is also lower than that of the same node under the other two methods. For all three strategies, node 2 has lower disk utilization than node 1.

Table 7.2 summarizes the execution times, file sizes, and compression ratios, as well as the temperatures and utilizations of the CPUs and disks. In this table, N1 and N2 represent node 1 and node 2, respectively. CT enjoys a compression ratio of 21.9%; data are not compressed in the other two methods. DT exhibits the shortest execution time among the three tested strategies.

Table 7.2 Summary of Single Text File Transmission
Methods                  DT              AT              CT
                         N1      N2      N1      N2      N1      N2
Execution Time (s)       17      17      18      20      42      47
AVG U_CPU (%)            65.7    63.9    63.0    61.5    93.4    17.9
AVG U_Disk (%)           20.3    65.0    19.3    55.9    6.8     19.0
MAX T_CPU (℃)            47      48      47      48      49      43
MAX T_Disk (℃)           33      33      33      33      33      33
Data Transferred (MB)    507.7           507.7           111.2
Compression Ratio (%)    100             100             21.9
Total Energy Cost (J)    4,036.9         4,459.2         9,952.8

We observe that CT suffers from the highest CPU utilization on node 1 due to the compression overhead, whereas on node 2 its CPU utilization is lower than under the other two methods. The peak CPU temperature of node 2 under the CT method is the lowest among all the methods. The first two methods have a similar thermal impact on the two nodes. By comparing the overall energy costs of these three methods, we observe that DT is the most energy-efficient approach.
In short, we conclude that the archiving and compression processes lead to high CPU temperature and utilization, which in turn have a noticeable impact on the total energy cost of storage systems.

Transferring Source Code Files

We evaluate a second case in which Linux source code files are transferred between the two storage nodes. Figs. 7.4, 7.5, and 7.6 reveal the temperatures and utilizations of the CPUs and disks when the three data transfer strategies are adopted.

Figure 7.4: Performance of transferring Linux kernel files in direct transmission ((a) node 1; (b) node 2).

For transferring the Linux kernel files with direct transmission (DT), as shown in Fig. 7.4, the time to finish the data transmission is a little longer than 80 seconds. Both data nodes have relatively low CPU utilization during the entire transmission procedure, with the CPU utilization of data node 1 between 10% and 40% and that of node 2 between 20% and 30%. Moreover, on data node 1 the CPU utilization in the first 40 seconds is about 10% higher than in the following 40 seconds, and we observe the same trend for the CPU utilization of data node 2. For the disks, node 1 maintains a utilization between 20% and 40%, while node 2 exhibits a widely distributed utilization (from 0% to 100%). The temperatures of the two data nodes also differ: the CPU temperature of data node 1 reaches 46 ℃, while the CPU temperature of data node 2 is heated up to 44 ℃.

Figure 7.5: Performance of transferring Linux kernel files in archived transmission ((a) node 1; (b) node 2).

Transferring the Linux kernel files with archived transmission (AT), as shown in Fig. 7.5, costs only about 40 seconds, which is half of the time needed to transfer these files using direct transmission (DT). This is because the archival process packs the large number of small source files into a single file, which reduces the per-file network overhead. Unlike DT, AT results in high disk utilization on data node 1 (from 30% to 90%) and a higher average disk utilization on data node 2 (from 20% to 100%). The CPU utilizations on both data nodes are almost the same as under the DT strategy.

Figure 7.6: Performance of transferring Linux kernel files in compressed transmission ((a) node 1; (b) node 2).

As shown in Fig. 7.6, the time needed to transfer the same data files is about 60 seconds with the compressed transmission (CT) strategy. The CPU utilizations in this experiment are quite different from those in the previous two experiments.
We observe a very high CPU utilization (from 50% to 80%) on data node 1 during the data transmission, and its CPU temperature is heated up to 48 ℃. On data node 2, although the CPU utilization remains between 10% and 20%, the CPU temperature is also higher than under the other two transmission strategies. The disk utilization of data node 1 is between 30% and 60%, while the disk utilization of data node 2 ranges from 0% to 100%; compared with the DT strategy, the disk utilization of data node 2 is more concentrated at higher values.

Table 7.3 Empirical Results of Transferring Linux Source Code Files
Methods                  DT              AT              CT
                         N1      N2      N1      N2      N1      N2
Execution Time (s)       81      90      40      60      49      57
AVG U_CPU (%)            20.8    16.1    24.2    17.4    68.6    15.7
AVG U_Disk (%)           27.4    23.0    56.7    69.1    45.9    61.5
MAX T_CPU (℃)            46      44      46      44      48      46
MAX T_Disk (℃)           33      32      32      33      33      32
Data Transferred (MB)    454.8           475.8           103.8
Compression Ratio (%)    100             100             23
Total Energy Cost (J)    16,164          15,938          16,718

For better comparison, we summarize the detailed results in Table 7.3. We observe from the table that AT achieves the best performance in terms of execution time. The CT scheme transfers only 103.8 MB of data, which is 23% of the original data size, over the network; however, CT does not exhibit the shortest transmission time because of the extra overhead caused by data compression and decompression. When it comes to the AT method, even though the size of the data transferred over the network is larger than that of DT, the transmission time of AT is much shorter than that of DT. This performance trend is reasonable because the Linux kernel package contains a large number (i.e., 40,927) of small files: transferring these small files one by one takes a long time due to network latencies, whereas merging the small files into a single large file helps to reduce the network overhead.

Like the findings obtained from the first group of experiments, the compression process results in the highest CPU temperature and utilization in the case of CT. Although the peak disk temperature is different from that observed in the first experiment, the peak temperature remains unchanged across all the methods during the execution period. From the thermal behaviour's perspective, DT and AT are more thermal friendly than CT. From the energy perspective, AT consumes less energy than the other two strategies.

Motivation of the Predictive Thermal-aware Management

The above preliminary findings suggest that it is challenging to accurately estimate the energy costs of data transmissions, for the following three reasons. First, the total energy cost (including computing and cooling costs) caused by data transmissions depends on CPU and disk temperatures, transmission times, and compression ratios. Second, there is no single energy-efficient data-transfer strategy that fits a wide range of cases: the DT scheme can transfer a single large text file energy-efficiently (see Section 7.2), whereas AT is the most energy-efficient strategy for transferring a large number of small files (see Section 7.2); the impact of data compression on energy consumption relies largely on the features of the files being transferred. Third, data transmissions occur frequently in cluster storage systems, and it is impractical to manually choose the best data-transfer strategy in a dynamic computing environment where the features of the transferred files are continually changing. Automatically selecting an appropriate method is therefore critical to saving energy on data transmissions.
To address this problem, we design a predictive thermal-aware management system, or PTMS. Two phases are incorporated in PTMS. The first phase predicts the energy consumption incurred by executing each candidate data-transfer strategy; predictions are obtained by comprehensively considering compression ratios, transmission times, file types, and data sizes. The second phase is a straightforward selection made by comparing the predicted energy costs of the candidate strategies. The details of PTMS are presented in the next section.

7.3 Framework of the Predictive Thermal-aware Management System

Fig. 7.7 shows the framework of the predictive thermal-aware management system (PTMS for short). It depicts a storage system equipped with n data nodes, with PTMS applied on each node. The monitor module gathers runtime parameters related to data transmissions, file metadata, and storage nodes (e.g., temperatures and utilizations). When a data transmission request is detected, the module forwards the request to the method selector, which chooses a thermal-friendly data-transfer strategy for the transmission.

Figure 7.7: The framework of the predictive thermal-aware management system. Monitors on nodes 1 to n forward transmission requests and runtime data to the Method Selector, which sends prediction requests to the Energy Predictor and receives estimated energy costs in return.

The Method Selector not only maintains the candidate data-transfer strategies, but also judiciously chooses the best strategy to reduce the thermal impact and the energy cost. Fig. 7.7 shows that, upon the arrival of a data-transmission request, the Method Selector forwards the request along with all the candidate strategies to the Energy Predictor. According to the energy estimates offered by the predictor, the Method Selector notifies the monitor module of the candidate strategy that will incur the lowest energy cost to transfer the data.

The Energy Predictor, shown in Fig. 7.8, provides the energy estimates of data transmissions handled by a particular strategy. In our predictive thermal management system, the predictor focuses on the energy consumption (including both computing energy cost and cooling cost) of the data nodes in the storage system. Before estimating the energy cost, the data transmission type must be determined; there are mainly three types of data transmission from the perspective of data centers: upload, download, and data transmission within data centers. The performance models and energy models proposed in the PEAM system are used [60].

Figure 7.8: Framework of the energy predictor module. A performance model, a thermal model, a COP model, and a computing energy model take the network bandwidth, dataset size, transmission method, compression ratio, and transmission type as inputs, and produce the cooling cost, the computing cost, and the total energy cost.

7.3.1 Performance Model

The performance model derives the CPU/disk utilization and the data-transmission time from the information provided by prediction requests; such information includes the network bandwidth, dataset size, data transmission method, and compression ratio. Compression schemes and their compression ratios for given file types are maintained in the model as a static data structure. The execution time of a data-transmission process is made up of the data transmission time and, if applicable, the compression/decompression time; the compression/decompression time is determined by the data size and the compression method.
If a data-transmission strategy does not apply data compression techniques, the compression/decompression time is ignored. Obviously, the data compression overhead might be offset by the time saved in transferring data over the network.

The utilization of CPUs and disks can be derived as a function of Method (i.e., the data-transfer method) and R_compression (i.e., the compression ratio). Thus, we have

U_CPU = g(Method, R_compression),    (7.1)
U_disk = h(Method, R_compression),    (7.2)

where U_CPU and U_disk are the average CPU and disk utilizations. We express the execution time of a data-transmission process as

T_execution = k(size, Method, R_compression, Bandwidth)
            = T_read + T^Method_pre_proc + T_send + T_receive + T^Method_after_proc + T_write,    (7.3)

where size, R_compression, and Bandwidth denote the data size, compression ratio, and network bandwidth. T_execution is the execution time if Method is applied to transfer the data. T_read is the time spent reading the original file into the cache on the source node, and it depends on the size value. T^Method_pre_proc is the time for pre-processing the data with a specific method; for example, with the CT method the data is compressed in the source node's cache. T^Method_after_proc is the time for processing the transferred data (e.g., decompression). T_send and T_receive are the sending and receiving times of the data delivered over the network; they are affected by Bandwidth and R_compression. T_write is the time spent writing the received data to the destination disk.

7.3.2 Thermal Model

The thermal model estimates the outlet temperature of a storage node based on its CPU and disk utilizations. CPU temperatures, which are sensitive to CPU utilization, can be expressed as

T_CPU(t) = f_CPU(T_CPU_i, T_A, U_CPU, t),    (7.4)

where T_CPU_i and T_A denote the initial CPU temperature and the ambient temperature, U_CPU represents the CPU utilization, and t is the CPU running time under a specific utilization.

Differing from CPU temperatures, disk temperatures are not noticeably sensitive to disk utilization over a short period of time. However, if a disk is active for a longer period, its temperature is affected by its utilization. The disk temperature can be modelled as

T_disk(t) = f_disk(T_disk_i, T_A, U_disk, t),    (7.5)

where T_disk_i and T_A are the initial disk temperature and the ambient temperature, U_disk represents the disk utilization, and t is the time the disk works in the active state.

Since the CPU and disk are the two major contributors to the outlet temperature of a storage node, we use the outlet temperature model proposed in the previous chapter to quantify the thermal impact of CPU and disk activities on outlet temperatures:

T_outlet = T_inlet + a + b * T_CPU + c * T_disk,    (7.6)

where T_inlet and T_outlet are the inlet and outlet temperatures of a storage node, a represents the impact of the other components on the outlet temperature, b is the thermal impact from CPU temperatures, and c is the impact from disk temperatures.

7.3.3 Computing Energy Power Model

We use (7.7) to calculate the computing power, where P_i is the power of a component sitting idle, U_component refers to the utilization of the component in a storage node, and P^max_component and P^idle_component are the power when the component works at full capacity and when it is in the idle state, respectively:

P_C = P_i + U_component * (P^max_component - P^idle_component).    (7.7)

With the computing power P_C and the cooling power P_AC (see Chapter 3) in place, we can express the overall power as

P_Total = P_C + P_AC.    (7.8)
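To tie Eqs. (7.1)–(7.8) together, the Python sketch below shows one way the Energy Predictor and Method Selector could be wired up. All numeric values here (the power figures, the per-method utilization table, the COP, and the processing rates) are assumed placeholders rather than the fitted models of this dissertation, and the thermal model of Eqs. (7.4)–(7.6) is collapsed into a fixed COP for brevity; the point is the structure of the prediction and the selection step, not the exact estimates.

```python
# (idle W, max W) per component and an assumed air-conditioner COP; placeholders.
POWER = {"cpu": (15.0, 65.0), "disk": (5.0, 10.0)}
COP = 6.5

# Average utilizations per method, standing in for g() and h() of Eqs. (7.1)-(7.2).
UTIL = {"DT": (0.6, 0.5), "AT": (0.6, 0.5), "CT": (0.9, 0.2)}   # (U_CPU, U_disk)

def execution_time(method, size_mb, ratio, bandwidth_mb_per_s, proc_mb_per_s=50.0):
    """Eq. (7.3): read/write, send/receive, and optional pre/post processing."""
    transferred = size_mb * (ratio if method == "CT" else 1.0)
    t_io = 2 * size_mb / 100.0                 # T_read + T_write, assuming 100 MB/s disks
    t_net = transferred / bandwidth_mb_per_s   # T_send + T_receive
    t_proc = (2 * size_mb / proc_mb_per_s) if method in ("AT", "CT") else 0.0
    return t_io + t_net + t_proc

def component_power(name, util):
    """Eq. (7.7): P = P_idle + U * (P_max - P_idle)."""
    p_idle, p_max = POWER[name]
    return p_idle + util * (p_max - p_idle)

def total_energy(method, size_mb, ratio, bandwidth_mb_per_s):
    """Computing energy plus COP-based cooling energy for one transfer (Joules)."""
    u_cpu, u_disk = UTIL[method]
    t = execution_time(method, size_mb, ratio, bandwidth_mb_per_s)
    p_computing = component_power("cpu", u_cpu) + component_power("disk", u_disk)
    p_total = p_computing + p_computing / COP   # Eq. (7.8) with P_AC = P_C / COP
    return p_total * t

def select_method(size_mb, ratio, bandwidth_mb_per_s, methods=("DT", "AT", "CT")):
    """Method Selector: pick the candidate with the lowest predicted energy."""
    return min(methods, key=lambda m: total_energy(m, size_mb, ratio, bandwidth_mb_per_s))

# Example: a 507.7 MB file that would compress to 22% of its size.
print(select_method(size_mb=507.7, ratio=0.22, bandwidth_mb_per_s=110.0))
```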
7.4 Results

Massive amounts of data are uploaded to and downloaded from data centers. For instance, 72 hours of video are uploaded to YouTube every minute, 350 GB of data are uploaded to Facebook every minute, and 15,000 tracks are downloaded from iTunes every minute [25]. Uploading and downloading such a large amount of data consumes considerable energy and time; even worse, the energy cost of data centers is rising dramatically with the increasing amount of data. To evaluate the energy efficiency of our predictive thermal-aware management system designed for data centers, we conduct two sets of experiments.

In the first group of experiments, a pair of data nodes transfer a dataset that contains hundreds of ASCII files generated by Postmark. The dataset's size is 1 GB, and each file is anywhere between 1 MB and 100 MB. Among all the transferred files, small files are accessed more frequently than large files, so it is important to study the energy consumption caused by transferring small files. For example, a report shows that 500 million files were saved on Dropbox every 48 hours as of May 2012 [40]. A majority of Dropbox users use their free space to store small files; in most cases, the files uploaded to the Dropbox servers are small.

We compare the performance of the four data transmission strategies (i.e., DT, AT, CT, and PTMS) in transferring the two datasets. Fig. 7.9 shows the energy cost of node 1 when it transfers the first dataset to node 2. We observe that, compared with the other strategies, AT consumes less energy on both nodes 1 and 2 when the ASCII files are transmitted. CT is the least energy-efficient scheme among all the evaluated strategies.

Figure 7.9: Energy cost of data nodes in transferring the ASCII files.

Now we are in a position to evaluate the energy efficiency of our PTMS. Fig. 7.10 shows the energy cost of the four strategies under different transmission types. Not surprisingly, CT consumes more energy transferring this dataset than the other strategies, mainly because data compression and/or decompression cost extra CPU time and energy. Regardless of the transmission type, PTMS is the best among all the tested strategies.

Figure 7.10: Energy cost of transferring the ASCII files under different transmission types.

To resemble real-world cases where large files are transferred, in the second group of experiments we use a 60 GB dataset of Human Genome sequences. This dataset is available at NIH's (National Institutes of Health) NCBI website (ftp://ftp.ncbi.nih.gov/genomes/H_sapiens). Each sequence file contains the DNA sequence of an entire chromosome, and most of the files in this dataset are larger than 3 GB.

Figure 7.11: Energy cost of data nodes in transferring the Human Genome dataset.

Figure 7.12: Energy cost of transferring the Human Genome dataset under different transmission types.

Fig. 7.11 shows the energy incurred by transferring the Human Genome dataset between nodes 1 and 2. Fig. 7.12 depicts the energy cost of transferring the Human Genome dataset with the four strategies under different transmission types. We observe that, regardless of the data node, AT and PTMS outperform the other two strategies.
The experimental results suggest that PTMS noticeably conserves energy for all three data transmission types.

7.5 Summary

The surprisingly high energy consumption of data centers makes it demanding to improve the energy efficiency of large-scale storage systems. In modern data centers, data management introduces big data operations to achieve high I/O performance by judiciously placing files; such operations can incur both performance and energy overheads due to frequent data movement. We aim to reduce the energy costs of data centers by offering an energy-aware data management strategy that improves the energy efficiency of data storage systems.

In this chapter, we first characterized the thermal and performance behaviours of three data transmission methods. A thermal-aware data transmission strategy was then proposed, in which data transmission is divided into three camps: uploads, downloads, and migrations within a data center. We implemented the thermal-aware data transmission strategy in a predictive thermal-aware management system, or PTMS, which estimates the energy consumption of data nodes to guide the management of data transmissions. Among all the candidate data transmission policies, PTMS dynamically chooses the most appropriate one to meet the needs of a wide range of data-intensive applications coupled with various data transmission patterns. Our experimental results show that, in terms of energy efficiency, our system performs better than always selecting any single one of the candidate methods for data transmission.

Chapter 8
Conclusion

In this dissertation, we demonstrated a thermal modeling approach that investigates the thermal impacts of both CPUs and disks in data nodes. The model is used to estimate the outlet temperatures of data nodes based on CPU and disk utilization. In addition, we incorporated our thermal models into thermal management strategies that make data nodes thermal- and energy-friendly. The first strategy is integrated into a scheduler that dispatches and redistributes I/O tasks in such a way that all the data nodes' outlet temperatures stay below a predetermined threshold. Following it are a two-stage data placement strategy for homogeneous data storage systems and an SSD-first data placement strategy for hybrid storage systems. The last one is a thermal-aware data transmission strategy, in which data transfers are divided into three camps: uploads, downloads, and migrations within a data center. We implemented the thermal-aware data transmission strategy in a predictive thermal-aware management system, or PTMS, which estimates the energy consumption of data nodes to guide the management of data transmissions. Among all the candidate data transmission policies, PTMS dynamically chooses the most appropriate one to meet the needs of a wide range of data-intensive applications coupled with various data transmission patterns.

8.1 Main Contributions

Energy consumption of data centers has increased dramatically in recent years. The computing costs of IT facilities and the cooling costs of air conditioning systems contribute a large portion of the total energy consumption of data centers. There is an urgent need to build energy-efficient data centers, and growing attention has been paid to reducing their cooling costs. The temperatures of data nodes in data centers have been identified as a key factor in cooling costs.
8.1.3 Thermal Modeling and Energy Consumption of Data Nodes

Modeling the temperature of data nodes is a critical step prior to estimating their energy consumption, especially the cooling cost of data centers. With the thermal models in place, the outlet temperatures of data nodes can be predicted under particular workloads without deploying temperature sensors. Combining these models enables administrators to set an appropriate supply temperature, which substantially reduces the cooling cost of data centers. Cooling cost is derived from the computing cost of data nodes and the COP (Coefficient of Performance) model, which is a function of the cooling system's supply temperature. The total energy cost of a data center is the summation of its computing cost and cooling cost.

8.1.4 Thermal-aware Task Scheduling System

Dispatching tasks plays a significant role in balancing load and reducing the energy consumption of data storage systems. Conventional task scheduling strategies distribute tasks to decrease the computing cost of data nodes in storage systems; a newer trend also considers reducing the cooling cost of data nodes. With the energy consumption of data nodes estimated by the thermal models, we proposed a task scheduling strategy that keeps the outlet temperatures of data nodes well balanced. My task scheduling strategy not only selects the data node to which a task should be assigned so as to minimize the total energy cost of the storage system, but also ensures that the outlet temperatures of data nodes do not exceed a predetermined threshold, which protects computing resources from operating in a high-temperature environment.
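The sketch below illustrates, under stated assumptions, how the two ideas above can be combined: total energy is taken as computing energy plus a cooling term derived from a COP curve, and a task is dispatched to the feasible node (predicted outlet temperature below the threshold) with the lowest estimated total energy. The COP coefficients, node figures, and function names are illustrative placeholders rather than the exact parameters used in this dissertation.

# Hypothetical sketch of thermal-aware node selection; all parameter values are illustrative.
from dataclasses import dataclass

@dataclass
class NodeEstimate:
    name: str
    computing_joules: float   # predicted computing energy if the task runs on this node
    outlet_temp_c: float      # predicted outlet temperature if the task runs on this node

def cop(supply_temp_c: float) -> float:
    # Assumed quadratic COP curve of the cooling system versus its supply temperature;
    # the coefficients here are placeholders for illustration only.
    return 0.0068 * supply_temp_c**2 + 0.0008 * supply_temp_c + 0.458

def total_energy(computing_joules: float, supply_temp_c: float) -> float:
    # Total cost = computing cost + cooling cost, where cooling cost is the
    # computing cost divided by the COP at the chosen supply temperature.
    return computing_joules + computing_joules / cop(supply_temp_c)

def pick_node(candidates, supply_temp_c: float, outlet_threshold_c: float):
    # Keep only nodes whose predicted outlet temperature stays below the threshold,
    # then choose the one with the lowest estimated total energy.
    feasible = [n for n in candidates if n.outlet_temp_c <= outlet_threshold_c]
    if not feasible:
        return None  # no node can accept the task without violating the threshold
    return min(feasible, key=lambda n: total_energy(n.computing_joules, supply_temp_c))

nodes = [NodeEstimate("node1", 1.8e4, 36.5),
         NodeEstimate("node2", 1.6e4, 39.2),
         NodeEstimate("node3", 2.1e4, 34.8)]
best = pick_node(nodes, supply_temp_c=20.0, outlet_threshold_c=38.0)
print(best.name if best else "no feasible node")

In this toy example, node2 is rejected because its predicted outlet temperature exceeds the 38 °C threshold, and node1 is chosen on estimated total energy; this mirrors the two-part selection rule described above.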
8.1.5 Data Placement in Homogeneous Disk Arrays

Evidence has shown that disks have non-negligible thermal impacts on data nodes. In modern data centers, a single data node usually hosts multiple disks; for instance, a Teradata machine can house more than 100 disks. Data placement can significantly affect the thermal behaviors of data nodes. The thermal impacts and energy consumption of data nodes under various data placement schemes motivated me to build a new data placement strategy. My data placement strategy consists of two stages: in the initial stage, data are distributed evenly inside the data center; then, in the redistribution stage, data are migrated according to the distribution of outlet temperatures.

8.1.6 Data Placement in Hybrid Cluster Storage Systems

Hybrid cluster storage systems can be classified into two categories: inter-node and intra-node hybrid systems. In an intra-node hybrid cluster storage system, each node contains both solid-state disks (SSDs) and hard disk drives (HDDs). In an inter-node hybrid system, some nodes are equipped with SSDs while others are comprised of HDDs. The performance and thermal behaviors of hard disk drives and solid-state disks are explored, and an SSD-first strategy is proposed to minimize the negative thermal impacts of hybrid storage clusters.

8.1.7 Predictive Thermal-aware Management System (PTMS)

By investigating the thermal impacts and energy consumption of several candidate data transmission strategies, we developed the PTMS system, which chooses the most energy-efficient data transmission strategy for data storage systems. PTMS is composed of three components: an energy cost predictor, a method selector, and monitors. The energy cost predictor estimates the energy consumption of a data transmission given the size of the data to be transferred, the compression ratio, the network bandwidth, and the like. The method selector chooses the best data transmission method in terms of energy efficiency, and the monitors collect run-time information on each data node.
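As an illustration of how the energy cost predictor and the method selector could fit together, here is a minimal sketch: for each candidate transmission method, it estimates transfer energy from the data size, compression ratio, and network bandwidth, adds a compression/decompression term where applicable, and picks the cheapest method. The power figures, throughput numbers, method names, and the simple linear energy formula are assumptions for illustration, not the model actually implemented in PTMS.

# Hypothetical sketch of a PTMS-style method selector; all numbers are illustrative.
from dataclasses import dataclass

@dataclass
class Method:
    name: str
    compression_ratio: float   # transferred size = data size / ratio (1.0 = no compression)
    cpu_overhead_w: float      # extra CPU power drawn while (de)compressing
    compress_rate_mb_s: float  # (de)compression throughput in MB/s; 0 means not applicable

def estimate_energy(method, data_mb, bandwidth_mb_s, transfer_power_w=30.0):
    # Network transfer: time = compressed size / bandwidth, energy = power * time (joules).
    transfer_time = (data_mb / method.compression_ratio) / bandwidth_mb_s
    energy = transfer_power_w * transfer_time
    # Add compression/decompression energy when the method compresses data.
    if method.compress_rate_mb_s > 0:
        energy += method.cpu_overhead_w * (data_mb / method.compress_rate_mb_s)
    return energy

def select_method(methods, data_mb, bandwidth_mb_s):
    # The selector simply returns the candidate with the lowest estimated energy.
    return min(methods, key=lambda m: estimate_energy(m, data_mb, bandwidth_mb_s))

candidates = [Method("direct", 1.0, 0.0, 0.0),                 # send the data as-is
              Method("compress-then-send", 2.5, 25.0, 40.0)]   # compress before sending
chosen = select_method(candidates, data_mb=1024.0, bandwidth_mb_s=12.5)
print(chosen.name)

With a faster network or a poorly compressible dataset, the balance tips back toward direct transmission, which is why a predictive system of this kind re-evaluates the choice for each transfer rather than fixing one method in advance.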
8.2 Future Work

8.2.1 Considering Ambient Temperatures

Ambient temperatures are a major factor affecting disk temperatures, yet modeling their impact on disk temperatures is still in its infancy. I plan to conduct experiments to study the thermal behavior of disks under different workload conditions and various ambient temperatures. Besides Postmark, I will also run continuous read and write benchmarks to study disk thermal behavior.

8.2.2 Data Storage Nodes Equipped with Multiple Disks

My preliminary findings suggest that deploying an additional disk increases the outlet temperature by about 0.3 °C. Each of my currently tested nodes houses no more than four disks, whereas real-world data nodes may be equipped with more than 64 disks. I will further investigate the impact of large numbers of disks on outlet temperatures.

8.2.3 Heterogeneous Data Centers

My current research focuses on task scheduling and data placement in homogeneous data centers. With the rapid development of technology, heterogeneous data centers are becoming popular: when a data center is expanded, new equipment is deployed, which makes the data center heterogeneous in nature. I intend to design new scheduling and data placement algorithms tailored for heterogeneous data centers.

8.2.4 Thermal Models for Hadoop Clusters

Hadoop clusters, which support the processing of large data sets in a distributed computing environment, have been widely used in modern data centers. Hadoop distributes workload among thousands of data nodes and continues to operate even if some of the data nodes fail, and each data file in the Hadoop system has three replicas. I plan to develop new thermal models to capture the thermal behaviors of Hadoop clusters; these models will incorporate various data placement strategies designed for Hadoop clusters.

8.2.5 Energy-aware Hadoop Distributed File System

I have investigated thermal-aware data transmission inside a data center. In the future, I plan to study energy-efficient data management in Hadoop clusters. In the Hadoop system, there are usually three replicas of each data block; thus, when a file is imported into the Hadoop distributed file system (HDFS), three copies of the file are created. I will develop an energy-efficient HDFS that manages replicas in a way that reduces energy consumption. In addition, I will develop an energy-efficient data transmission mechanism to efficiently transfer massive amounts of data between clients and HDFS.

8.2.6 Addressing Big Data Challenges

Big Data refers to collections of large and complex data sets that are difficult to process with traditional data management tools. My long-term goal is to address big data challenges such as data processing, storage, and transfer. Among the wide variety of big data applications, I will focus on genomics and biological research. I plan to start this research by investigating two genomics and bioinformatics applications running on a Hadoop cluster. These applications are drivers for my future parallel computing studies focused on data analytics. Placement of massive amounts of data will be addressed in my future research while these data-intensive applications are being developed.

8.2.7 Security Issues of Data Storage Systems

A traditional method to ensure the security of data is encryption. However, a new trend is that hackers send continuous requests to data servers to make the servers overheat until they go down. I plan to apply thermal-aware management strategies to distribute workload and control the responses to user requests so as to protect data servers in data storage systems.

Bibliography

[1] hddtemp. http://manpages.ubuntu.com/manpages/natty/man8/hddtemp.8.html.
[2] Intel ssd sa2m080g2gc. http://download.intel.com/newsroom/kits/ssd/pdfs/X25-M_34nm_DataSheet.pdf.
[3] lm-sensors. http://www.lm-sensors.org/.
[4] Minigoose-ii. http://www.itwatchdogs.com/datasheets/MiniGoose_II_User_Manual_v1_05.pdf.
[5] stress. http://www.unixref.com/manPages/stress.html.
[6] Temperature sensor. http://www.itwatchdogs.com/datasheets/Temperature%20Sensor%20datasheet%20(v1.06).pdf.
[7] Wd1600aajs specification. http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701277.pdf.
[8] Wd5000aaks specification. http://www.wdc.com/wdproducts/library/SpecSheet/ENG/2879-701277.pdf.
[9] Whetstone. http://www.netlib.org/benchmark/whetstones.
[10] Global data center energy demand forecasting. Technical report, DatacenterDynamics, 2011.
[11] What happens on facebook in each day? http://visual.ly/what-happens-facebook-each-day, 2012.
[12] Accenture and WSP. Cloud computing and sustainability: The environmental benefits of moving to the cloud. Technical report, Accenture, 2010.
[13] B. Aksanli, J. Venkatesh, L. Zhang, and T. Rosing. Utilizing green energy prediction to schedule mixed batch and service jobs in data centers. SIGOPS Oper. Syst. Rev., 45(3):53–57, Jan. 2012.
[14] M. Al Assaf, X. Jiang, M. Abid, and X. Qin. Eco-storage: A hybrid storage system with energy-efficient informed prefetching. Journal of Signal Processing Systems, 72(3):165–180, 2013.
[15] A. R. Alameldeen and D. A. Wood. Adaptive cache compression for high-performance processors. SIGARCH Comput. Archit. News, 32(2):212–, Mar. 2004.
[16] M. Allalouf, Y. Arbitman, M. Factor, R. I. Kat, K. Meth, and D. Naor. Storage modeling for power estimation. In Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, SYSTOR '09, pages 3:1–3:10, New York, NY, USA, 2009. ACM.
[17] R. Ayoub, K. Indukuri, and T. Rosing. Temperature aware dynamic workload scheduling in multisocket cpu servers. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 30(9):1359–1372, Sept 2011.
[18] R. Ayoub, R. Nath, and T. Rosing. Jetc: Joint energy thermal and cooling management for memory and cpu subsystems in servers. In High Performance Computer Architecture (HPCA), 2012 IEEE 18th International Symposium on, pages 1–12, Feb 2012.
[19] R. Ayoub, R. Nath, and T. S. Rosing. Cometc: Coordinated management of energy/thermal/cooling in servers. ACM Trans. Des. Autom. Electron. Syst., 19(1):1:1–1:28, Dec. 2013.
[20] M. Baile. The economics of virtualization: Moving toward an application-based cost model. Technical report, VMware, November 2009.
[21] M. Balakrishnan, A. Kadav, V. Prabhakaran, and D. Malkhi. Differential raid: Rethinking raid for ssd reliability. Trans. Storage, 6(2):4:1–4:22, July 2010.
[22] A. Banerjee, T. Mukherjee, G. Varsamopoulos, and S. K. S. Gupta. Cooling-aware and thermal-aware workload placement for green hpc data centers. In Proceedings of the International Conference on Green Computing, GREENCOMP '10, pages 245–256, Washington, DC, USA, 2010. IEEE Computer Society.
[23] A. Beloglazov, J. Abawajy, and R. Buyya. Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Generation Computer Systems, 28(5):755–768, 2012. Special Section: Energy efficiency in large-scale distributed systems.
[24] A. Beloglazov and R. Buyya. Energy efficient resource management in virtualized cloud data centers. In Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, CCGRID '10, pages 826–831, Washington, DC, USA, 2010. IEEE Computer Society.
[25] S. Bennett. What happens online in 60 seconds? http://www.mediabistro.com/alltwitter/online-60-seconds_b46813, July 25, 2013.
[26] W. L. Bircher and L. K. John. Complete system power estimation using processor performance events. IEEE Trans. Comput., 61(4):563–577, Apr. 2012.
[27] T. Bostoen, S. Mullender, and Y. Berbers. Power-reduction techniques for data-center storage systems. ACM Comput. Surv., 45(3):33:1–33:38, July 2013.
[28] D. Brooks, V. Tiwari, and M. Martonosi. Wattch: A framework for architectural-level power analysis and optimizations. SIGARCH Comput. Archit. News, 28(2):83–94, May 2000.
[29] A. Cannane and H. E. Williams. A general-purpose compression scheme for large collections. ACM Trans. Inf. Syst., 20(3):329–355, July 2002.
[30] L.-P. Chang. Hybrid solid-state disks: combining heterogeneous nand flash in large ssds. In Proceedings of the 2008 Asia and South Pacific Design Automation Conference, ASP-DAC '08, pages 428–433, Los Alamitos, CA, USA, 2008. IEEE Computer Society Press.
[31] Y.-H. Chang, C.-K. Hsieh, P.-C. Huang, and P.-C. Hsiu.
A caching-oriented management design for the performance enhancement of solid-state drives. Trans. Storage, 8(1):3:1–3:21, Feb. 2012.
[32] C.-H. Chao, K.-Y. Jheng, H.-Y. Wang, J.-C. Wu, and A.-Y. Wu. Traffic- and thermal-aware run-time thermal management scheme for 3d noc systems. In Networks-on-Chip (NOCS), 2010 Fourth ACM/IEEE International Symposium on, pages 223–230, May 2010.
[33] F. Chen, J. Grundy, J.-G. Schneider, Y. Yang, and Q. He. Automated analysis of performance and energy consumption for cloud applications. In Proceedings of the 5th ACM/SPEC International Conference on Performance Engineering, ICPE '14, pages 39–50, New York, NY, USA, 2014. ACM.
[34] F. Chen, J. Grundy, Y. Yang, J.-G. Schneider, and Q. He. Experimental analysis of task-based energy consumption in cloud computing systems. In Proceedings of the 4th ACM/SPEC International Conference on Performance Engineering, ICPE '13, pages 295–306, New York, NY, USA, 2013. ACM.
[35] F. Chen, D. A. Koufaty, and X. Zhang. Hystor: making the best use of solid state drives in high performance storage systems. In Proceedings of the international conference on Supercomputing, ICS '11, pages 22–32, New York, NY, USA, 2011. ACM.
[36] K.-C. Chen, S.-Y. Lin, and A.-Y. Wu. Design of thermal management unit with vertical throttling scheme for proactive thermal-aware 3d noc systems. In VLSI Design, Automation, and Test (VLSI-DAT), 2013 International Symposium on, pages 1–4, April 2013.
[37] D. Chiu, C. Stewart, and B. McManus. Electric grid balancing through low-cost workload migration. SIGMETRICS Perform. Eval. Rev., 40(3):48–52, Jan. 2012.
[38] D. Colarelli and D. Grunwald. Massive arrays of idle disks for storage archives. In Proceedings of the 2002 ACM/IEEE conference on Supercomputing, Supercomputing '02, pages 1–11, Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.
[39] T. G. Consortium. 7 strategies to optimize data centre cooling. http://www.biztechmagazine.com/article/2011/01/keep-your-cool/, Jan. 2011.
[40] J. Constine. Dropbox is now the data fabric tying together devices for 100m registered users who save 1b files a day. http://techcrunch.com/2012/11/13/dropbox-100-million/, Nov. 2012.
[41] G. Cook. How clean is your cloud? Technical report, Greenpeace International, April 2012.
[42] R. Das, A. Mishra, C. Nicopoulos, D. Park, V. Narayanan, R. Iyer, M. Yousif, and C. Das. Performance and power optimization through data compression in network-on-chip architectures. In High Performance Computer Architecture, 2008. HPCA 2008. IEEE 14th International Symposium on, pages 215–225, feb. 2008.
[43] W. Deng, F. Liu, H. Jin, C. Wu, and X. Liu. Multigreen: Cost-minimizing multi-source datacenter power supply with online control. In Proceedings of the Fourth International Conference on Future Energy Systems, e-Energy '13, pages 149–160, New York, NY, USA, 2013. ACM.
[44] P. Desnoyers. Analytic models of ssd write performance. Trans. Storage, 10(2):8:1–8:25, Mar. 2014.
[45] T. Diop, N. E. Jerger, and J. Anderson. Power modeling for heterogeneous processors. In Proceedings of Workshop on General Purpose Processing Using GPUs, GPGPU-7, pages 90:90–90:98, New York, NY, USA, 2014. ACM.
[46] P. Eibeck and D. Cohen. Modeling thermal characteristics of a fixed disk drive. Components, Hybrids, and Manufacturing Technology, IEEE Transactions on, 11(4):566–570, dec 1988.
[47] N. El-Sayed, I. A. Stefanovici, G. Amvrosiadis, A. A. Hwang, and B. Schroeder. Temperature management in data centers: Why some (might) like it hot. SIGMETRICS Perform. Eval.
Rev., 40(1):163–174, June 2012.
[48] D. Essary and A. Amer. Sustainable predictive storage management: On-line grouping for energy and latency reduction. In Proceedings of the 4th Annual International Conference on Systems and Storage, SYSTOR '11, pages 9:1–9:11, New York, NY, USA, 2011. ACM.
[49] A. Fanara, J. Abelson, A. Bailey, K. Crossman, R. Shudak, A. Sullivan, M. Vargas, and M. Zatz. Report to congress on server and data center energy efficiency. Technical report, U.S. Environmental Protection Agency, August 2007.
[50] K. P. Ganeshpure, I. Polian, S. Kundu, and B. Becker. Reducing temperature variability by routing heat pipes. In Proceedings of the 19th ACM Great Lakes Symposium on VLSI, GLSVLSI '09, pages 63–68, New York, NY, USA, 2009. ACM.
[51] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel. The cost of a cloud: Research problems in data center networks. SIGCOMM Comput. Commun. Rev., 39(1):68–73, Dec. 2008.
[52] L. M. Grupp, J. D. Davis, and S. Swanson. The bleak future of nand flash memory. In Proceedings of the 10th USENIX Conference on File and Storage Technologies, FAST '12, pages 2–2, Berkeley, CA, USA, 2012. USENIX Association.
[53] S. K. S. Gupta, A. Banerjee, Z. Abbasi, G. Varsamopoulos, M. Jonas, J. Ferguson, R. R. Gilbert, and T. Mukherjee. Gdcsim: A simulator for green data center design and analysis. ACM Trans. Model. Comput. Simul., 24(1):3:1–3:27, Jan. 2014.
[54] S. Gurumurthi, A. Sivasubramaniam, and V. K. Natarajan. Disk drive roadmap from the thermal perspective: A case for dynamic thermal management. SIGARCH Comput. Archit. News, 33(2):38–49, May 2005.
[55] A. Hammadi and L. Mhamdi. Review: A survey on architectures and energy efficiency in data center networks. Comput. Commun., 40:1–21, Mar. 2014.
[56] J. Huang, F. Zhang, X. Qin, and C. Xie. Exploiting redundancies and deferred writes to conserve energy in erasure-coded storage clusters. Trans. Storage, 9(2):4:1–4:29, July 2013.
[57] I. E. Insights. Annual it spending by western european utilities to reach 12.7 billion by 2017, Aug 2013.
[58] X. Jiang, M. Al Assaf, J. Zhang, M. Alghamdi, X. Ruan, T. Muzaffar, and X. Qin. Thermal modeling of hybrid storage clusters. Journal of Signal Processing Systems, 72(3):181–196, 2013.
[59] X. Jiang, M. Alghamdi, J. Zhang, M. Assaf, X. Ruan, T. Muzaffar, and X. Qin. Thermal modeling and analysis of storage systems. In Performance Computing and Communications Conference (IPCCC), 2012 IEEE 31st International, pages 31–40, 2012.
[60] X. Jiang, J. Zhang, M. Alghamdi, X. Qin, M. Jiang, and J. Zhang. Peam: Predictive energy-aware management for storage systems. In Networking, Architecture and Storage (NAS), 2013 IEEE Eighth International Conference on, pages 105–114, July 2013.
[61] X. Jimenez, D. Novo, and P. Ienne. Phœnix: Reviving mlc blocks as slc to extend nand flash devices lifetime. In Proceedings of the Conference on Design, Automation and Test in Europe, DATE '13, pages 226–229, San Jose, CA, USA, 2013. EDA Consortium.
[62] P. Jones. Industry census 2012: Emerging data center markets. https://www.datacenterdynamics.com/blogs/industry-census-2012-emerging-data-center-markets, October 2012.
[63] J. Katcher. Postmark: A new file system benchmark. System, (3022):1–8, 1997.
[64] A. Kaur and S. Kinger. Temperature aware resource scheduling in green clouds. In Advances in Computing, Communications and Informatics (ICACCI), 2013 International Conference on, pages 1919–1923, Aug 2013.
[65] Y. Kim, S. Gurumurthi, and A. Sivasubramaniam.
Understanding the performance-temperature interactions in disk i/o of server workloads. In High-Performance Computer Architecture, 2006. The Twelfth International Symposium on, pages 176–186, feb. 2006.
[66] R. Koller, L. Marmol, R. Rangaswami, S. Sundararaman, N. Talagala, and M. Zhao. Write policies for host-side flash caches. In Proceedings of the 11th USENIX Conference on File and Storage Technologies, FAST '13, pages 45–58, Berkeley, CA, USA, 2013. USENIX Association.
[67] J. Kong, S. W. Chung, and K. Skadron. Recent thermal management techniques for microprocessors. ACM Comput. Surv., 44(3):13:1–13:42, June 2012.
[68] R. Kothiyal, V. Tarasov, P. Sehgal, and E. Zadok. Energy and performance evaluation of lossless file data compression on server systems. In Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, SYSTOR '09, pages 4:1–4:12, New York, NY, USA, 2009. ACM.
[69] Y. Lee and A. Zomaya. Energy efficient utilization of resources in cloud computing systems. The Journal of Supercomputing, 60(2):268–280, 2012.
[70] Y. C. Lee and A. Y. Zomaya. Energy efficient utilization of resources in cloud computing systems. J. Supercomput., 60(2):268–280, May 2012.
[71] L. Li, C.-J. M. Liang, J. Liu, S. Nath, A. Terzis, and C. Faloutsos. Thermocast: a cyber-physical forecasting model for datacenters. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '11, pages 1370–1378, New York, NY, USA, 2011. ACM.
[72] Z. Li, K. M. Greenan, A. W. Leung, and E. Zadok. Power consumption in enterprise-scale backup storage systems. In Proceedings of the 10th USENIX Conference on File and Storage Technologies, FAST '12, pages 6–6, Berkeley, CA, USA, 2012. USENIX Association.
[73] J. Lin, H. Zheng, Z. Zhu, E. Gorbatov, H. David, and Z. Zhang. Software thermal management of dram memory for multicore systems. SIGMETRICS Perform. Eval. Rev., 36(1):337–348, June 2008.
[74] J. Lin, H. Zheng, Z. Zhu, and Z. Zhang. Thermal modeling and management of dram systems. IEEE Transactions on Computers, 99(PrePrints), 2012.
[75] R.-S. Liu, C.-L. Yang, C.-H. Li, and G.-Y. Chen. Duracache: A durable ssd cache using mlc nand flash. In Proceedings of the 50th Annual Design Automation Conference, DAC '13, pages 166:1–166:6, New York, NY, USA, 2013. ACM.
[76] Z. Liu, Y. Chen, C. Bash, A. Wierman, D. Gmach, Z. Wang, M. Marwah, and C. Hyser. Renewable and cooling aware workload management for sustainable data centers. SIGMETRICS Perform. Eval. Rev., 40(1):175–186, June 2012.
[77] J. Lu and F. Dawson. Emc computer modeling techniques for cpu heat sink simulation. Magnetics, IEEE Transactions on, 42(10):3171–3173, Oct 2006.
[78] T. Luo, S. Ma, R. Lee, X. Zhang, D. Liu, and L. Zhou. S-cave: Effective ssd caching to improve virtual machine storage performance. In Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, PACT '13, pages 103–112, Piscataway, NJ, USA, 2013. IEEE Press.
[79] B. Mao, H. Jiang, S. Wu, L. Tian, D. Feng, J. Chen, and L. Zeng. Hpda: A hybrid parity-based disk array for enhanced performance and reliability. Trans. Storage, 8(1):4:1–4:20, Feb. 2012.
[80] R. Miller. Facebook's $1 billion data center network. http://www.datacenterknowledge.com/archives/2012/02/02/facebooks-1-billion-data-center-network/, February 2012.
[81] M. P. Mills. The cloud begins with coal: Big data, big networks, big infrastructure, and big power.
http://www.tech-pundit.com/wp-content/uploads/2013/07/Cloud_Begins_With_Coal.pdf?c761ac&c761ac, August 2013.
[82] A. K. Mishra, S. Srikantaiah, M. Kandemir, and C. R. Das. Coordinated power management of voltage islands in cmps. SIGMETRICS Perform. Eval. Rev., 38(1):359–360, June 2010.
[83] J. Moore, J. Chase, P. Ranganathan, and R. Sharma. Making scheduling "cool": temperature-aware workload placement in data centers. In Proceedings of the annual conference on USENIX Annual Technical Conference, ATEC '05, pages 5–5, Berkeley, CA, USA, 2005. USENIX Association.
[84] D. Narayanan, A. Donnelly, and A. Rowstron. Write off-loading: Practical power management for enterprise storage. In Proceedings of the 6th USENIX Conference on File and Storage Technologies, FAST '08, pages 17:1–17:15, Berkeley, CA, USA, 2008. USENIX Association.
[85] D. Narayanan, E. Thereska, A. Donnelly, S. Elnikety, and A. Rowstron. Migrating server storage to ssds: Analysis of tradeoffs. In Proceedings of the 4th ACM European Conference on Computer Systems, EuroSys '09, pages 145–158, New York, NY, USA, 2009. ACM.
[86] Y. Oh, J. Choi, D. Lee, and S. H. Noh. Improving performance and lifetime of the ssd raid-based host cache through a log-structured approach. In Proceedings of the 1st Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads, INFLOW '13, pages 5:1–5:8, New York, NY, USA, 2013. ACM.
[87] E. Pakbaznia, M. Ghasemazar, and M. Pedram. Temperature-aware dynamic resource provisioning in a power-optimized datacenter. In Proceedings of the Conference on Design, Automation and Test in Europe, DATE '10, pages 124–129, 3001 Leuven, Belgium, Belgium, 2010. European Design and Automation Association.
[88] A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt, S. Madden, and M. Stonebraker. A comparison of approaches to large-scale data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of data, SIGMOD '09, pages 165–178, New York, NY, USA, 2009. ACM.
[89] E. Pinheiro and R. Bianchini. Energy conservation techniques for disk array-based servers. In Proceedings of the 18th annual international conference on Supercomputing, ICS '04, pages 68–78, New York, NY, USA, 2004. ACM.
[90] E. Pinheiro, W.-D. Weber, and L. A. Barroso. Failure trends in a large disk drive population. In Proceedings of the 5th USENIX conference on File and Storage Technologies, pages 2–2, Berkeley, CA, USA, 2007. USENIX Association.
[91] L. Ramos and R. Bianchini. C-oracle: Predictive thermal management for data centers. In High Performance Computer Architecture, 2008. HPCA 2008. IEEE 14th International Symposium on, pages 111–122, feb. 2008.
[92] S. Ren and Y. He. Coca: Online distributed resource management for cost minimization and carbon neutrality in data centers. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, SC '13, pages 39:1–39:12, New York, NY, USA, 2013. ACM.
[93] A. Riska and E. Smirni. Autonomic exploration of trade-offs between power and performance in disk drives. In Proceedings of the 7th International Conference on Autonomic Computing, ICAC '10, pages 131–140, New York, NY, USA, 2010. ACM.
[94] A. Sansottera and P. Cremonesi. Cooling-aware workload placement with performance constraints. Perform. Eval., 68(11):1232–1246, Nov. 2011.
[95] O. Sarood, A. Gupta, and L. Kale. Temperature aware load balancing for parallel applications: Preliminary work.
In Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium on, pages 796–803, may 2011.
[96] O. Sarood and L. V. Kale. A 'cool' load balancer for parallel applications. In Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, pages 21:1–21:11, New York, NY, USA, 2011. ACM.
[97] D. Schall, V. Hudlet, and T. Härder. Enhancing energy efficiency of database applications using ssds. In Proceedings of the Third C* Conference on Computer Science and Software Engineering, C3S2E '10, pages 1–9, New York, NY, USA, 2010. ACM.
[98] P. Sehgal, V. Tarasov, and E. Zadok. Optimizing energy and performance for server-class file system workloads. Trans. Storage, 6(3):10:1–10:31, Sept. 2010.
[99] A. Shah, V. Carey, C. Bash, C. Patel, and R. Sharma. Exergy analysis of data center thermal management systems. In Y. Joshi and P. Kumar, editors, Energy Efficient Thermal Management of Data Centers, pages 383–446. Springer US, 2012.
[100] M. Sharifi, H. Salimi, and M. Najafzadeh. Power-efficient distributed scheduling of virtual machines using workload-aware consolidation techniques. J. Supercomput., 61(1):46–66, July 2012.
[101] R. Sharma, C. Bash, C. Patel, R. Friedrich, and J. Chase. Balance of power: dynamic thermal management for internet data centers. Internet Computing, IEEE, 9(1):42–49, jan.-feb. 2005.
[102] J.-Y. Shin, M. Balakrishnan, L. Ganesh, T. Marian, and H. Weatherspoon. Gecko: A contention-oblivious design for cloud storage. In Proceedings of the 4th USENIX Conference on Hot Topics in Storage and File Systems, HotStorage '12, pages 4–4, Berkeley, CA, USA, 2012. USENIX Association.
[103] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan. Temperature-aware microarchitecture: Modeling and implementation. ACM Trans. Archit. Code Optim., 1(1):94–125, Mar. 2004.
[104] M. Song, Y. Lee, and E. Kim. Saving disk energy in video servers by combining caching and prefetching. ACM Trans. Multimedia Comput. Commun. Appl., 10(1s):15:1–15:21, Jan. 2014.
[105] J. Srinivasan and S. V. Adve. Predictive dynamic thermal management for multimedia applications. In Proceedings of the 17th annual international conference on Supercomputing, ICS '03, pages 109–120, New York, NY, USA, 2003. ACM.
[106] Statista. Number of monthly active facebook users worldwide from 3rd quarter 2008 to 1st quarter 2014 (in millions), 2014.
[107] C. Tan, J. Yang, J. Mou, and E. Ong. Three dimensional finite element model for transient temperature prediction in hard disk drive. In Magnetic Recording Conference, 2009. APMRC '09. Asia-Pacific, pages 1–2, jan. 2009.
[108] Q. Tang, S. Gupta, and G. Varsamopoulos. Thermal-aware task scheduling for data centers through minimizing heat recirculation. In Cluster Computing, 2007 IEEE International Conference on, pages 129–138, sept. 2007.
[109] Q. Tang, S. Gupta, and G. Varsamopoulos. Thermal-aware task scheduling for data centers through minimizing heat recirculation. In Cluster Computing, 2007 IEEE International Conference on, pages 129–138, sept. 2007.
[110] Q. Tang, S. K. S. Gupta, and G. Varsamopoulos. Energy-efficient thermal-aware task scheduling for homogeneous high-performance computing data centers: A cyber-physical approach. IEEE Trans. Parallel Distrib. Syst., 19(11):1458–1472, Nov. 2008.
[111] P. Thibodeau. Data centers use 2% of u.s. energy, below forecast, August 2011.
[112] W. Tian, Q. Xiong, and J. Cao.
An online parallel scheduling method with application to energy-efficiency in cloud computing. J. Supercomput., 66(3):1773–1790, Dec. 2013.
[113] N. Vasic, T. Scherer, and W. Schott. Thermal-aware workload scheduling for energy efficient data centers. In Proceedings of the 7th international conference on Autonomic computing, ICAC '10, pages 169–174, New York, NY, USA, 2010. ACM.
[114] J. Whitney and J. Kennedy. Is cloud computing always greener? Technical report, Natural Resources Defense Council, October 2012.
[115] G. Wu, X. He, and B. Eckart. An adaptive write buffer management scheme for flash-based ssds. Trans. Storage, 8(1):1:1–1:24, Feb. 2012.
[116] T. Xie and Y. Sun. Understanding the relationship between energy conservation and reliability in parallel disk arrays. J. Parallel Distrib. Comput., 71:198–210, February 2011.
[117] F. Yan, X. Mountrouidou, A. Riska, and E. Smirni. Quantitative estimation of the performance delay with propagation effects in disk power savings. In Proceedings of the 2012 USENIX Conference on Power-Aware Computing and Systems, HotPower '12, pages 5–5, Berkeley, CA, USA, 2012. USENIX Association.
[118] G.-W. You, S.-W. Hwang, and N. Jain. Ursa: Scalable load and power management in cloud storage systems. Trans. Storage, 9(1):1:1–1:29, Mar. 2013.
[119] M. Zapater, J. L. Ayala, and J. M. Moya. Leveraging heterogeneity for energy minimization in data centers. In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012), CCGRID '12, pages 752–757, Washington, DC, USA, 2012. IEEE Computer Society.
[120] Z. Zhang and S. Fu. Characterizing power and energy usage in cloud computing systems. In Proceedings of the 2011 IEEE Third International Conference on Cloud Computing Technology and Science, CLOUDCOM '11, pages 146–153, Washington, DC, USA, 2011. IEEE Computer Society.