Minimum Power Consumption for Rate Monotonic Tasks
Except where reference is made to the work of others, the work described in this
thesis is my own or was done in collaboration with my advisory committee. This
thesis does not include proprietary or classified information.
Chiao Ching Huang
Certificate of Approval:
Cheryl Seals
Associate Professor
Computer Science and Software Engi-
neering
Sanjeev Baskiyar, Chair
Associate Professor
Computer Science and Software Engi-
neering
Tin-Yau Tam
Professor
Mathematics and Statistics
George T. Flowers
Dean
Graduate School
Minimum Power Consumption for Rate Monotonic Tasks
Chiao Ching Huang
A Thesis
Submitted to
the Graduate Faculty of
Auburn University
in Partial Fulfillment of the
Requirements for the
Degree of
Master of Science
Auburn, Alabama
December 19, 2008
Minimum Power Consumption for Rate Monotonic Tasks
Chiao Ching Huang
Permission is granted to Auburn University to make copies of this thesis at its
discretion, upon the request of individuals or institutions and at
their expense. The author reserves all publication rights.
Signature of Author
Date of Graduation
Vita
Chiao Ching Huang was born in Tainan City, Taiwan, on March 19, 1978. She
graduated from Feng-Chia University with a Bachelor of Science degree in 2000. After
working for Chinese Petroleum Corp. for almost five years, she entered, in the fall of 2005, the
graduate program in the Department of Computer Science and Software Engineering
at Auburn University. While in pursuit of her Master of Science degree, she worked
as a graduate assistant for Auburn University Graduate School under Deans George
Flowers and Joe Pittman.
Thesis Abstract
Minimum Power Consumption for Rate Monotonic Tasks
Chiao Ching Huang
Master of Science, December 19, 2008
(B.S., Feng Chia University, 2000)
62 Typed Pages
Directed by Sanjeev Baskiyar
Embedded computer systems are widely used in modern life and their use is expanding. One of the typical constraints in embedded systems, particularly in stand-alone devices, is their limited power capacity. One way to extend battery lifetime is to reduce power consumption; because of the quadratic relationship between power consumption in CMOS circuits and CPU voltage, researchers can achieve power reduction by scaling down the supply voltage through Dynamic Voltage Scaling (DVS). However, reducing the supply voltage also slows down the CPU, since the supply voltage is roughly proportional to the clock frequency of the processor, i.e., the CPU speed. As a result, DVS succeeds at the cost of system performance. In a Real-Time embedded environment, especially in Hard Real-Time embedded systems, the timing constraint is a critical element that cannot be ignored. It is therefore difficult to balance power savings against system throughput so that all tasks still complete before their deadlines.
In this thesis, we focus on tasks scheduled by the Rate Monotonic (RM) algorithm in a Hard Real-Time embedded environment. We derive an equation for scaling the worst-case execution time (WCET) of each task, expanding each WCET by a factor that depends on the corresponding task's computation time, until the slack times are fully occupied. Unlike other approaches, we combine the power consumption equation with the constraint of the Rate Monotonic schedulability test (RM test). From the result of this formulation, we find the minimum power consumption at which the RM task set still passes the RM test and all tasks are guaranteed to meet their deadlines. Our approach can be categorized as an off-line Intra-Task Dynamic Voltage Scaling (IntraDVS) technique.
Acknowledgments
I would like to thank God for giving me an opportunity to attend graduate
school and pursue a master's degree. I would also like to thank Him for the immense
strength, constant support and unconditional love.
I would like to extend my sincere appreciation to the members of my master's
committee for their knowledge, time and assistance.
I am thankful to my academic advisor, Dr. Sanjeev Baskiyar, for his support, pa-
tience, encouragement and priceless guidance. His enthusiasm, creativity, and careful
review have helped me accomplish an enormous task.
I am especially thankful to Dr. Tin-Yau Tam for guiding me through the most
critical part of my research. The math proofs and the Matlab program were written
under his valuable directions.
I would like to take this opportunity to thank the Graduate School for providing
me with an assistantship and financial support.
Lastly, I thank my beloved family and friends for their endless love, support,
patience and encouragement.
Style manual or journal used: Journal of Approximation Theory (together with the style known as "aums"). Bibliography follows van Leunen's A Handbook for Scholars.

Computer software used: The document preparation package TeX (specifically LaTeX) together with the departmental style file aums.sty.
Table of Contents

List of Figures
List of Tables
1 Introduction
  1.1 Current Problem
  1.2 Motivation
  1.3 Thesis Outline
2 Dynamic Voltage Scaling
  2.1 CMOS Circuit Design
  2.2 DVS Algorithm with DVS Processor
  2.3 Problems
3 Formalization
  3.1 Rate Monotonic Schedulability Test
  3.2 Power Consumption
  3.3 Energy Consumption
  3.4 Problem Definition
4 Solution to the optimization problem
  4.1 Mathematical Statements and Proofs
  4.2 Scaling Factor Assignment Algorithm
  4.3 Revised Scaling Factor Assignment Algorithm
5 Result and Analysis
6 Summary
Bibliography
Appendices
List of Figures

4.1 Plot of f(X_2)
5.1 Time-line for Task Set A before scaling, where computation times are specified at the maximum CPU frequency
5.2 Scaled Time-line for Task Set A
5.3 Time-line for Task Set B
5.4 Scaled Time-line for Task Set B
List of Tables

4.1 Procedure ScaleFactors Pseudocode
4.2 Procedure ScaleFactorsDnC Pseudocode
4.3 Procedure ScaleFactorsBackward Pseudocode
5.1 Example Task Set A
5.2 Energy Consumption before Scaling for Task Set A
5.3 Energy Consumption after Scaling for Task Set A
5.4 Example Task Set B
5.5 Energy Consumption before Scaling for Task Set B
5.6 Energy Consumption after Scaling for Task Set B
Chapter 1
Introduction
1.1 Current Problem
The maturity of wireless technology and mobile markets has dramatically changed people's lifestyles. More and more consumers depend on wireless or portable devices such as cellular phones, personal digital assistants (PDAs), digital cameras, and global positioning system (GPS) receivers. With the rapidly growing popularity of handheld devices, power supply or battery lifetime has always been one of the major determinants in the purchase decision. For decades, researchers have sought the optimal solution to save power in handhelds and to produce more energy-efficient power supplies.
As far as Real-Time embedded systems are concerned, power consumption becomes an even bigger issue. One of the well-known characteristics of a Real-Time Operating System is its time constraint. In a Soft Real-Time system, missed deadlines may not affect the whole system's functionality. However, a missed deadline in a Hard Real-Time system could lead to a non-recoverable system failure. Reducing power dissipation often means sacrificing system performance; that is, tasks may complete late, which cannot be tolerated in a Hard Real-Time system [12].
1.2 Motivation
This thesis focuses on Rate Monotonic (RM) schedulable tasks for Hard Real-Time embedded systems. The task model is preemptive and static-priority driven. The approach is to expand the computation time of each task until the combined CPU utilization approximately reaches the upper bound of the Rate Monotonic schedulability test (RM test) [9], thus lowering the CPU speed and supply voltage while guaranteeing that none of the tasks misses its deadline. Using the method of Lagrange multipliers, the approach derives a specific formula and an iteration yielding the exact scaling factor for each task. The computation time of each corresponding task is then increased by applying this scaling factor. Consequently, the minimum power consumption is achieved and all the tasks are processed within their time constraints.
1.3 Thesis Outline
The remainder of this thesis is organized as follows. Chapter 2 reviews recent
studies. Specifically, different kinds of DVS approaches and their challenges will be
introduced and analyzed. Chapter 3 gives a detailed explanation for the formulation
and deduction of the problem. In Chapter 4, we present the mathematical development and explain the solution to the problem described in Chapter 3; we provide a complete proof and present the algorithms derived from this study. Chapter 5 includes experiments and observations made during this research and discusses issues regarding the results, along with suggestions for future work. Chapter 6 concludes this study.
Chapter 2
Dynamic Voltage Scaling
A large number of research activities have been conducted in the past decades
to achieve power reduction. In this chapter, we introduce Dynamic Voltage Scaling
(DVS) techniques and present some of its strategies.
Before we start, we explain some terms in scheduling Hard Real-Time tasks.
Worst Case Execution Times (WCETs)

As implied by the name, the worst-case execution time (WCET) is the estimated maximum possible run-time for a request to complete its task without missing its deadline. As W. Kim et al. [7] explain, "In scheduling Hard Real-Time tasks, in order to guarantee the timing constraint of each task, the execution times of tasks are usually assumed to be the worst case execution times (WCETs)." The WCET is often used as a standard design assumption when scheduling Hard Real-Time tasks.
Slack Time

Slack times can be classified as static slack times and dynamic slack times [7]. Static slack time is defined as the time frame between the moment a task finishes its execution and its next invocation. Off-line (or static) approaches, e.g., static voltage scaling, the path-based method, the stochastic method and maximum constant speed, can fully utilize these static slack times, which are known in advance, by reducing the CPU frequency and consequently the supply voltage.

Dynamic slack time is the difference between the actual execution time and the WCET (the actual execution time being less than the WCET). The actual execution time of a task is unpredictable at run-time; in fact, a large number of tasks finish earlier than their WCETs [5]. For this reason, static approaches cannot save power aggressively. On-line approaches such as stretching to NTA, priority-based slack stealing and utilization updating exploit this characteristic by reclaiming dynamic slack times and recomputing the CPU frequency at task release time to lower power consumption.
Dynamic Voltage Scaling (DVS) is one way to reduce power consumption: it changes the supply voltage dynamically, slowing down the CPU in order to save energy. DVS is based on two observations and has become a mainstream technique in the area of power reduction. The first is the quadratic relationship between power dissipation and supply voltage (p ∝ V²). The second is that, even in a Hard Real-Time environment, as long as tasks complete on time there is no reason to require higher system throughput.
2.1 CMOS Circuit Design
Since hardware makers have made an almost "limitless" number of transistors available, A. P. Chandrakasan et al. [2] proposed redesigning the system architecture: CMOS circuit operations are delayed as much as possible while the lowest feasible supply voltage is used, thereby reducing dynamic power dissipation. This architecture-driven voltage scaling technique is capable of achieving the goal of power savings, regardless of its complexity.
2.2 DVS Algorithm with DVS Processor
With the improvement of CMOS circuit techniques, more and more variable-voltage microprocessors based on CMOS logic have been brought to the market. Researchers have exploited this trend to develop many kinds of DVS algorithms that make use of these available DVS processors.

DVS algorithms can be further classified into intra-task DVS (IntraDVS) and inter-task DVS (InterDVS) algorithms according to how they use slack times. IntraDVS passes slack times from the current task to itself in the next cycle [4, 13]; usually, IntraDVS determines the supply voltage off-line. InterDVS distributes the slack times from the current task to the following tasks, and most existing approaches belong to this class [14, 12, 5, 3, 8]. Besides these two, researchers have also combined the strengths of IntraDVS and InterDVS to form hybrid DVS (HybridDVS) algorithms [7, 3, 8].
The first DVS algorithm was proposed by M. Weiser et al. [17]. This approach is an average-throughput-based DVS algorithm and considers only a Soft Real-Time environment; it cannot guarantee that Hard Real-Time deadlines will be met. Shin and Choi [14] considered the delay overheads of power-down mode and voltage switching and developed Low Power Fixed Priority Scheduling (LPFPS) with a slight modification of the Real-Time scheduler, in which they combined off-line and on-line approaches. However, voltage switching incurs a delay overhead, so their solution still needs a trade-off analysis between increased task execution time and power consumption.
Similar to our approach, the scheme proposed by Manzak and Chakrabarti [11] also tries to find the operating voltage of the processor for each task. They formulate an equation that minimizes total energy consumption subject to a constant total computation time. With the method of Lagrange multipliers, Manzak and Chakrabarti obtain a relationship among the task voltages under the constraint of minimum energy consumption. They further develop an iterative algorithm that adjusts task slack times and therefore the voltage assignments. For RM scheduling, the iterative voltage assignment proceeds until the combined CPU utilization reaches the upper bound of the RM test. Still, there is a trade-off between the energy/power savings and the complexity of this algorithm.

The stochastic IntraDVS described by F. Gruian [4] not only scales the voltage to fill static slack times but also considers the dynamic slack times caused by run-time task executions, combining them with an on-line slack distribution method to minimize energy consumption while meeting all deadlines.
Swaminathan and Chakrabarty [15] took into consideration the effect of voltage-switching times on the energy consumption of the task set and presented a novel mixed-integer linear programming model called extended low-energy earliest-deadline-first (E-LEDF) for this NP-complete scheduling problem. Swaminathan and Chakrabarty's approach targets non-preemptive Earliest-Deadline-First (EDF) task sets rather than preemptive RM task sets.
Pillai and Shin [12] used a single scaling factor, obtained from the lowest feasible CPU frequency for a given task set, for all tasks; their results show that statically scaled RM tasks cannot reduce energy consumption aggressively. Y. H. Tsai et al. [8] further exploited Pillai and Shin's static voltage scaling approach and incorporated it into their Deferred-Workload-based inter-task DVS (dwDVS).
Liu and Mok [10] defined the available cycle function (ACF) and the required cycle function (RCF) to limit the CPU speed so that there is no idle cycle while the CPU executes and no job misses its deadline. To solve the energy minimization problem, they propose an off-line algorithm and a low-complexity on-line dynamic algorithm to reclaim dynamic slack cycles. Liu and Mok's solution can be used with different Real-Time schedulers.
Whereas D. Shin et al. [13] and A. Manzak and C. Chakrabarti [11] used only off-line approaches, most researchers combined on-line and off-line approaches to achieve power savings. Shin and Choi [14] combined off-line and on-line approaches and also brought the processor into a power-down mode. Aydin et al. [5] included an off-line solution to compute the optimal speed and presented an on-line speed adjustment mechanism to reclaim dynamic slack times and further predict future task execution. Similarly, Gruian [4] used off-line task stretching to obtain scaling factors and incorporated on-line slack distribution to further utilize dynamic slack times.
2.3 Problems

Although off-line DVS algorithms have lower complexity and are easy to implement, on-line algorithms can yield even greater energy savings. However, because run-time workloads are difficult to predict, as mentioned above, on-line approaches also have higher complexity and a higher chance of missing task deadlines. To compensate, on-line approaches often speed tasks up so that they catch up with their deadlines. Yet accelerating tasks requires raising the CPU frequency and hence the supply voltage. Such approaches therefore rarely guarantee minimum power consumption; on the contrary, they risk consuming more energy.

Also, even though some of the above studies target a Hard Real-Time environment with task sets scheduled by the RM algorithm, few of them check whether the combined CPU utilization of the task set stays within the upper bound of the RM test. In other words, the task set may still meet all deadlines at run-time, but without assurance.
Chapter 3
Formalization
Since this research is based on the RM scheduling algorithm, the system model makes the same assumptions as Liu and Layland's [9]. Given a task set of n tasks t_i, i = 1, 2, …, n, we assume that the tasks are periodic with time periods T_i, i = 1, 2, …, n, and that each deadline is equal to its period. All the tasks are independent, and each has its own computation time C_i, i = 1, 2, …, n.
3.1 Rate Monotonic Schedulability Test

The combined CPU utilization u for multiprogramming in a Hard Real-Time embedded operating system is

    u = Σ_{i=1}^{n} C_i/T_i = C_1/T_1 + C_2/T_2 + ⋯ + C_n/T_n    (3.1)

where C_i is the computation time, T_i is the time period, and n is the number of tasks in the task set.

For the Rate Monotonic (RM) scheduling algorithm, Liu and Layland [9] showed that if u ≤ n(2^{1/n} − 1), the tasks t_i, i = 1, 2, …, n, are schedulable and will not miss their deadlines; this condition is called the Rate Monotonic schedulability test (RM test). Thus, the upper bound of the RM test is n(2^{1/n} − 1). The RM test "is said to be sufficient but not necessary" [1]. In other words, if a task set passes the RM test, it will meet all deadlines; if it fails the test, it may or may not fail at run-time.
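As a concrete illustration, the RM test can be evaluated in a couple of lines of MATLAB; the fragment below is a sketch, not part of the thesis code, and uses the data of Task Set A from Chapter 5:

    C = [3 3 1];  T = [8 10 14];        % computation times and periods
    n = numel(T);
    u = sum(C ./ T);                    % combined utilization, eq. (3.1)
    bound = n*(2^(1/n) - 1);            % RM upper bound, about 0.78 for n = 3
    schedulable = (u <= bound)          % true: this set passes the RM test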
3.2 Power Consumption

The power consumed per CPU cycle is

    p = α f N V²    (3.2)

where p is the power dissipation, α is the average activity factor, f is the clock frequency of the processor (the CPU frequency), N is the switching capacitance of the wires and transistor gates, and V is the supply voltage.

From (3.2), the power dissipation p is proportional to the CPU frequency f, the switching capacitance N, and V². Since the switching capacitance N is proportional to the number of transistors on the chip [16] and all the tasks use the same processor, N is constant. We therefore consider only reducing the CPU frequency f or the supply voltage V to achieve power reduction.

Since V is approximately proportional to the CPU frequency f, reducing f also reduces V proportionally [18]. We can thus say that the power dissipation p is proportional to the cube of the CPU frequency f:

    p ∝ f³    (3.3)

Reducing the CPU frequency f by a factor 1/X_i, where X_i is a scaling factor and is always greater than or equal to 1, increases the computation time C_i to X_i C_i. Thus we have the new combined CPU utilization u′:

    u′ = Σ_{i=1}^{n} X_i C_i/T_i = X_1 C_1/T_1 + X_2 C_2/T_2 + ⋯ + X_n C_n/T_n ≤ n(2^{1/n} − 1)    (3.4)
3.3 Energy Consumption

The energy consumption E_i for executing task t_i is

    E_i ∝ C_i · p_i    (3.5)

where C_i is the computation time of task t_i and p_i is the power dissipation of the processor. According to (3.3), we can rewrite the energy consumption E_i in the following form:

    E_i ∝ C_i · f_i³    (3.6)

where f_i is the CPU frequency. Since f is reduced to f/X_i, along with the equivalent scaling of C_i to X_i C_i, the scaled energy consumption E′_i becomes

    E′_i ∝ X_i C_i · (f/X_i)³ = C_i f³ / X_i²    (3.7)
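For example, if a task's frequency is halved (X_i = 2), its execution time doubles, but by (3.7) its energy consumption drops to C_i f³/4, i.e., to one quarter of its unscaled value. This is why trading speed for voltage is worthwhile whenever the schedule can absorb the longer execution time.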
3.4 Problem Definition

Based on the above derivations, we can restate our problem in the following form:

Given u = Σ_{i=1}^{n} C_i/T_i ≤ n(2^{1/n} − 1), our goal is to find X̂ := (X̂_1, …, X̂_n) which yields

    min Σ_{i=1}^{n} C_i / X̂_i²

subject to the constraints

    Σ_{i=1}^{n} X̂_i C_i/T_i ≤ n(2^{1/n} − 1)  and  1 ≤ X̂_i ≤ T_i/C_i.

Associated with our problem are the following parameters:

• u: combined CPU utilization
• n: number of tasks in a task set
• C_i: computation time of task t_i
• T_i: time period of task t_i
• X_i: scaling factor of task t_i
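For later reference, checking whether a candidate vector of scaling factors is feasible, and evaluating the objective, takes only a few lines of MATLAB. The sketch below merely illustrates the problem statement (it is not part of the thesis code) and assumes C, T and a candidate X are row vectors of equal length:

    f_obj    = sum(C ./ X.^2);                          % objective of the minimization
    feasible = all(X >= 1) && all(X <= T./C) && ...
               sum(X .* C ./ T) <= numel(T)*(2^(1/numel(T)) - 1);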
Chapter 4
Solution to the optimization problem
4.1 Mathematical Statements and Proofs
Based on the above derivations, we restate the problem in the following mathematical form:

Problem: Find X̂ := (X̂_1, …, X̂_n) which yields

    min Σ_{i=1}^{n} C_i / X_i²    (4.1)

subject to X = (X_1, …, X_n) ∈ S, where

    S := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i ≤ K, 1 ≤ X_i ≤ T_i/C_i }    (4.2)

and K := n(2^{1/n} − 1). We write f(X) := Σ_{i=1}^{n} C_i/X_i² for the objective and g(X) := Σ_{i=1}^{n} X_i C_i/T_i for the utilization constraint.

Lemma 4.2. The function h(n) := n(2^{1/n} − 1) is strictly monotonically decreasing and lim_{n→∞} h(n) = ln 2; in particular, h(n) < 1 for n ≥ 2.

Proof. Set a := 2^{1/(n(n+1))} > 1. By using the identity aⁿ − 1 = (a − 1)(a^{n−1} + a^{n−2} + ⋯ + a + 1), we obtain

    h(n) − h(n+1) = n(2^{1/n} − 1) − (n+1)(2^{1/(n+1)} − 1)
                  = n[ (2^{1/(n(n+1))})^{n+1} − 1 ] − (n+1)[ (2^{1/(n(n+1))})ⁿ − 1 ]
                  = (a − 1)[ n(aⁿ + a^{n−1} + ⋯ + 1) − (n+1)(a^{n−1} + ⋯ + 1) ]
                  = (a − 1)[ n aⁿ − (a^{n−1} + ⋯ + 1) ] ≥ 0

since aⁿ ≥ aⁱ for i = 1, …, n−1 as a > 1. So h(n) is strictly monotonically decreasing. By l'Hôpital's rule,

    lim_{n→∞} h(n) = lim_{n→∞} (2^{1/n} − 1)/(1/n) = lim_{n→∞} [ −(1/n²) 2^{1/n} ln 2 ] / [ −1/n² ] = lim_{n→∞} 2^{1/n} ln 2 = ln 2.
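To give a sense of how quickly the bound shrinks, the following short MATLAB fragment (a sketch, not part of the thesis code) evaluates h(n) for a few values of n:

    n = [1 2 3 4 10 100];
    h = n .* (2.^(1./n) - 1);    % RM bound h(n) = n(2^(1/n) - 1)
    disp([n; h]')                % approx. 1.0000, 0.8284, 0.7798, 0.7568, 0.7177, 0.6956

The values approach ln 2 ≈ 0.6931 from above, and for n ≥ 2 they are strictly below 1, as Lemma 4.2 states.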
Remark 4.3. When n ≥ 2, one can rewrite the set S as

    S = { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i ≤ n(2^{1/n} − 1), 1 ≤ X_i },

i.e., one can ignore the condition X_i ≤ T_i/C_i in (4.2), since it follows from Σ_{i=1}^{n} X_i C_i/T_i ≤ n(2^{1/n} − 1) < 1 (by Lemma 4.2) and 1 ≤ X_i. But we keep using (4.2) as the description of S.
Each X ∈ S must satisfy

    X_i < T_i/C_i,  i = 1, …, n,    (4.3)

otherwise X_j = T_j/C_j for some j, so that by Lemma 4.2 we would have

    Σ_{i=1}^{n} X_i C_i/T_i ≥ X_j C_j/T_j = 1 > n(2^{1/n} − 1),

a contradiction.
Lemma 4.4. The minimization problem (4.1) only has solution(s) in the compact set R, where

    R := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = n(2^{1/n} − 1), 1 ≤ X_i < T_i/C_i } ⊂ S.    (4.4)
Proof. Set X_ε := X + ε(1, …, 1) = (X_1 + ε, …, X_n + ε). If the minimum of f(X) occurred at a point X ∈ S with Σ_{i=1}^{n} X_i C_i/T_i strictly less than K, then for all sufficiently small ε > 0 we would still have X_ε ∈ S, while f(X_ε) < f(X) because every coordinate increases; this contradicts the minimality of f(X). Hence every minimizer satisfies Σ_{i=1}^{n} X_i C_i/T_i = K and, by (4.3), it lies in R.

Accordingly, problem (4.1) reduces to

    min_{X ∈ R} f(X).    (4.5)

Theorem 4.5. Let T_1 ≥ ⋯ ≥ T_n. If X̂ = (X̂_1, …, X̂_n) ∈ R is a solution to the minimization problem (4.5), then X̂_1 ≥ ⋯ ≥ X̂_n.

Proof. Suppose, on the contrary, that X̂_i < X̂_j for some i < j (so that T_i ≥ T_j). For δ > 0, define

    X̂(δ) := (X̂_1, …, X̂_i + δ, …, X̂_j − δ C_i T_j/(C_j T_i), …, X̂_n),

which leaves the constraint sum Σ_k X̂_k(δ) C_k/T_k = K unchanged; for all sufficiently small δ > 0, we have X̂(δ) ∈ R. Set

    φ(δ) := f(X̂(δ)) = C_1/X̂_1² + ⋯ + C_i/(X̂_i + δ)² + ⋯ + C_j/(X̂_j − δ C_i T_j/(C_j T_i))² + ⋯ + C_n/X̂_n².

Now

    dφ(δ)/dδ = −2C_i/(X̂_i + δ)³ + 2(C_i T_j/T_i)/(X̂_j − δ C_i T_j/(C_j T_i))³
             = 2C_i [ (X̂_i + δ)³ T_j/T_i − (X̂_j − δ C_i T_j/(C_j T_i))³ ] / [ (X̂_i + δ)³ (X̂_j − δ C_i T_j/(C_j T_i))³ ].

So

    φ′(0) = 2C_i [ X̂_i³ T_j/T_i − X̂_j³ ] / ( X̂_i³ X̂_j³ ) < 0,

since 1 ≤ X̂_i < X̂_j and T_j ≤ T_i. This contradicts the assumption that f(X̂) is the minimum.
We will first solve a slightly different problem: find the X ∈ R′ that yields

    min_{X ∈ R′} f(X),

where

    R′ = { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = K }.

In other words, we remove the condition 1 ≤ X_i ≤ T_i/C_i from the original minimization problem (4.1). Note that R ⊂ R′, and the region R′ is represented by the equation Σ_{i=1}^{n} X_i C_i/T_i = K and is a hyperplane in ℝⁿ. By the method of Lagrange multipliers, set

    ∇f = λ ∇g,

which implies

    −2C_i/X_i³ = λ C_i/T_i.

Since C_i ≠ 0,

    X^L_i = ( 2T_i/(−λ) )^{1/3}.    (4.6)

Substituting (4.6) into Σ_{i=1}^{n} X_i C_i/T_i = K gives

    Σ_{j=1}^{n} ( 2T_j/(−λ) )^{1/3} C_j/T_j = K.

Thus we have

    (1/(−λ)^{1/3}) Σ_{j=1}^{n} (2T_j)^{1/3} C_j/T_j = K.    (4.7)

So from (4.6) and (4.7),

    X^L_i = (2T_i)^{1/3} K / Σ_{j=1}^{n} (2T_j)^{1/3} C_j/T_j = T_i^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j < T_i/C_i    (4.8)

(since K < 1), which is the only critical point of f over R′. Moreover,

    f(X^L) = Σ_{i=1}^{n} C_i/(X^L_i)² = [ ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j )² / K² ] · ( Σ_{i=1}^{n} C_i/T_i^{2/3} )    (4.9)

is the local minimum value of f(X) over R′; thus it is the global minimum over R′. If T_1 ≥ ⋯ ≥ T_n, then from (4.8)

    X^L_1 ≥ ⋯ ≥ X^L_n.

However, we cannot conclude that 1 ≤ X^L_i for all i = 1, …, n, i.e., X^L may not be in R, since the second constraint of (4.1) need not be satisfied. Nevertheless, it is clear that

    f(X^L) = min_{X ∈ R′} f(X) ≤ min_{X ∈ R} f(X).    (4.10)
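As a quick numerical check, the unconstrained Lagrange solution (4.8) can be evaluated with a few lines of MATLAB; the sketch below is an illustration only, not the thesis code, and assumes T and C are row vectors already sorted so that T(1) ≥ ⋯ ≥ T(n):

    % Closed-form critical point X^L of (4.8) over the hyperplane R'
    n = numel(T);
    K = n*(2^(1/n) - 1);                 % RM utilization bound
    denom = sum(T.^(1/3) .* (C ./ T));   % sum_j T_j^(1/3) C_j/T_j
    XL = T.^(1/3) * K / denom;           % X^L_i = T_i^(1/3) K / denom
    % If min(XL) >= 1, XL already solves (4.1); otherwise iterate as in Section 4.2.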
Theorem 4.6. Let T_1 ≥ ⋯ ≥ T_n, and let X̂ = (X̂_1, …, X̂_n) ∈ R be a solution to the minimization problem (4.5).

1. X^L_n ≥ 1 if and only if X̂_n ≥ 1. In this case, X̂ = X^L is a solution to the minimization problem (4.1).

2. If X^L_n < 1, then

       X̂_1 ≥ ⋯ ≥ X̂_{r−1} > X̂_r = ⋯ = X̂_n = 1    (4.11)

   for some r = 1, …, n. Moreover, (X̂_1, …, X̂_{r−1}) is a solution to the following minimization problem:

       min Σ_{i=1}^{r−1} C_i/X_i² + Σ_{i=r}^{n} C_i

   subject to the (new) conditions

   (a) 1 ≤ X_i ≤ T_i/C_i, i = 1, …, r−1,
   (b) Σ_{i=1}^{r−1} C_i X_i/T_i ≤ K̃, where K̃ := K − Σ_{i=r}^{n} C_i/T_i.

   Namely,

       X̂_i = T_i^{1/3} K̃ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j,  i = 1, …, r−1.    (4.12)

In either case, X̂ is the unique solution to the minimization problem (4.5).
Proof. By Theorem 4.5, the solution X̂ ∈ R to the minimization problem (4.5) satisfies X̂_1 ≥ ⋯ ≥ X̂_n.

(1) By (4.8), X^L_1 ≥ ⋯ ≥ X^L_n, X^L_i < T_i/C_i for all i = 1, …, n, and Σ_{i=1}^{n} X^L_i C_i/T_i = K. So X^L_n ≥ 1 if and only if X^L ∈ R. In this case, X̂ = X^L by (4.10), and it is the unique solution since there is only one critical point for the minimization problem over R′.

(2) If X^L_n < 1, then X^L ∉ R. Since X^L is the only critical point in R′ and R ⊂ R′, we conclude that the minimum point(s) of f over R must lie on the boundary ∂R of R:

    ∂R := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = K, X_i = 1 for some i or X_j = T_j/C_j for some j }.

Since X_i < T_i/C_i for all i by (4.3), X̂ must occur in

    ∂̃R := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = K, X_i = 1 for some i }.

In other words, via Theorem 4.5 we have (4.11), i.e.,

    X̂_1 ≥ ⋯ ≥ X̂_{r−1} > X̂_r = ⋯ = X̂_n = 1,

for some r = 1, …, n. Because the region

    R_r := { (X_1, …, X_{r−1}, 1, …, 1) ∈ ℝⁿ : Σ_{i=1}^{r−1} C_i X_i/T_i + Σ_{i=r}^{n} C_i/T_i = K, 1 ≤ X_i ≤ T_i/C_i, i = 1, …, r−1 }

is a subset of R, (X̂_1, …, X̂_{r−1}) is the solution to the following minimization problem:

    min Σ_{i=1}^{r−1} C_i/X_i² + Σ_{i=r}^{n} C_i

subject to the (new) constraints

1. 1 ≤ X_i ≤ T_i/C_i, i = 1, …, r−1,
2. Σ_{i=1}^{r−1} C_i X_i/T_i ≤ K̃, where K̃ := K − Σ_{i=r}^{n} C_i/T_i.

Since T_1 ≥ ⋯ ≥ T_{r−1} and X̂_{r−1} > 1, by Theorem 4.6(1) the minimum is attained at (X̂_1, …, X̂_{r−1}), which is provided by the method of Lagrange multipliers, i.e.,

    X̂_i = T_i^{1/3} K̃ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j > 1,  i = 1, …, r−1.

Moreover, the solution is unique once r is fixed.

To show that X̂ is unique, it is sufficient to show that r is unique. Suppose on the contrary that (X̂_1, …, X̂_{r−1}, 1, …, 1) ∈ ℝⁿ and (X̂_1, …, X̂_{r′−1}, 1, …, 1) ∈ ℝⁿ both yield the minimum, i.e.,

    f((X̂_1, …, X̂_{r−1}, 1, …, 1)) = f((X̂_1, …, X̂_{r′−1}, 1, …, 1)) = min_{X ∈ R} f(X),

but r ≠ r′, where the second candidate is given on its leading coordinates by the corresponding Lagrange formula

    X̂_i = T_i^{1/3} K̃′ / Σ_{j=1}^{r′−1} T_j^{1/3} C_j/T_j > 1,  i = 1, …, r′−1,  with K̃′ := K − Σ_{i=r′}^{n} C_i/T_i.

For definiteness, assume r < r′; comparing the two candidates then gives f((X̂_1, …, X̂_{r−1}, 1, …, 1)) > f((X̂_1, …, X̂_{r′−1}, 1, …, 1)), a contradiction. So r is unique, and hence (X̂_1, …, X̂_{r−1}) and X̂ are unique.
When X^L_n < 1, to solve problem (4.1) it suffices to determine r−1, which is the largest index i such that X̂_i > 1, and apply (4.12). The following is very useful with respect to the determination of r−1.

Proposition 4.7. Let T_1 ≥ ⋯ ≥ T_n, and let r−1 be the largest integer i such that X̂_i > 1 (r−1 = 0 means that all of the X̂_i are 1). Then

    r−1 ≤ s_1,    (4.13)
where s_1 denotes the largest integer i such that X^L_i > 1 (s_1 = 0 means that all of the X^L_i are less than or equal to 1), i.e.,

    X^L_1 ≥ ⋯ ≥ X^L_{s_1} > 1 ≥ X^L_{s_1+1} ≥ ⋯ ≥ X^L_n.

Evidently, if X^L_n ≥ 1, then X̂ = X^L and r−1 = s_1.

Proof. Suppose on the contrary that r−1 > s_1. We would then have r−1 ≥ s_1 + 1, so that X^L_{r−1} ≤ 1. From (4.8),

    X^L_{r−1} = T_{r−1}^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ≤ 1  ⟺  K ≤ ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) / T_{r−1}^{1/3}.

So

    K ( Σ_{j=r}^{n} T_j^{1/3} C_j/T_j ) ≤ [ ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) / T_{r−1}^{1/3} ] ( Σ_{j=r}^{n} T_j^{1/3} C_j/T_j )
        = ( Σ_{j=r}^{n} (T_j/T_{r−1})^{1/3} C_j/T_j ) ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j )
        ≤ ( Σ_{j=r}^{n} C_j/T_j ) ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j )    (4.14)

since T_1 ≥ ⋯ ≥ T_n. Now, with K′ := K − Σ_{j=r}^{n} C_j/T_j,

    X̂_{r−1} / X^L_{r−1} = [ T_{r−1}^{1/3} K′ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j ] · [ Σ_{j=1}^{n} T_j^{1/3} C_j/T_j / ( T_{r−1}^{1/3} K ) ]
        = (K′/K) · [ Σ_{j=1}^{n} T_j^{1/3} C_j/T_j / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j ]
        = [ (K − Σ_{j=r}^{n} C_j/T_j)/K ] · [ Σ_{j=1}^{n} T_j^{1/3} C_j/T_j / ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j − Σ_{j=r}^{n} T_j^{1/3} C_j/T_j ) ]
        = [ K ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) − ( Σ_{j=r}^{n} C_j/T_j ) ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) ] / [ K ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) − K ( Σ_{j=r}^{n} T_j^{1/3} C_j/T_j ) ].

By (4.14), the numerator does not exceed the denominator, so X̂_{r−1} ≤ X^L_{r−1} ≤ 1, contradicting the fact that X̂_{r−1} > 1.
Proposition 4.7 leads to an algorithm for the determination of r−1 and thus of X̂. In general we may not be able to locate r−1 right after the first application of the Lagrange multiplier method (see Proposition 4.8), but we can repeat the process and will eventually arrive at the value of r−1. The following is an algorithm to compute the solution to problem (4.1).

The algorithm: Arrange T_1, …, T_n so that T_1 ≥ ⋯ ≥ T_n.

Step 1: By the method of Lagrange multipliers, i.e., from (4.8), set

    X^{(1)}_i := X^L_i = T_i^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j,    (4.15)

so that X^{(1)}_1 ≥ ⋯ ≥ X^{(1)}_n by Theorem 4.6. Notice that

    X^{(1)}_i = T_i^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j < T_i/C_i,  i = 1, …, n,    (4.16)

    X^{(1)}_1 = K / Σ_{j=1}^{n} (T_j/T_1)^{1/3} C_j/T_j ≥ K / Σ_{j=1}^{n} C_j/T_j ≥ 1.    (4.17)

Then consider the cases:

(a) if X^{(1)}_n ≥ 1, then X̂ = (X^{(1)}_1, …, X^{(1)}_n) by Theorem 4.6.

(b) if X^{(1)}_n < 1, i.e., there is 0 ≤ s_1 ≤ n−1 such that X^{(1)}_1 ≥ ⋯ ≥ X^{(1)}_{s_1} > 1 ≥ X^{(1)}_{s_1+1} ≥ ⋯ ≥ X^{(1)}_n, then by (4.13) r−1 ≤ s_1, where

    X̂_1 ≥ ⋯ ≥ X̂_{r−1} > 1 = X̂_r = X̂_{r+1} = ⋯ = X̂_n

and r−1 is fixed but still to be determined. In other words, we know that X̂_{s_1+1} = ⋯ = X̂_n = 1.

Remark: s_1 = 0 means that X^L_i ≤ 1 for all i = 1, …, n. But this actually means that X^L_i = 1 for all i, because g(X^L) = K.

Then the problem is reduced to the following: find X^{(2)}_1 ≥ ⋯ ≥ X^{(2)}_{s_1} ≥ 1 (= X^{(2)}_{s_1+1} = ⋯ = X^{(2)}_n), which yields

    min Σ_{i=1}^{s_1} C_i/X_i² + Σ_{i=s_1+1}^{n} C_i,  or simply  min Σ_{i=1}^{s_1} C_i/X_i²,

subject to the constraints

1. Σ_{i=1}^{s_1} X_i C_i/T_i = K^{(2)}, where

       K^{(2)} := K − Σ_{i=s_1+1}^{n} C_i/T_i,

   and obviously Σ_{i=1}^{s_1} C_i/T_i ≤ K^{(2)} (we may set K^{(1)} := K in Step 1).

2. 1 ≤ X_i ≤ T_i/C_i, i = 1, …, s_1.
Step 2: Apply the method of Lagrange multipliers to the reduced problem to yield

    X^{(2)}_i = T_i^{1/3} K^{(2)} / Σ_{j=1}^{s_1} T_j^{1/3} C_j/T_j,  i = 1, …, s_1,    (4.18)

so that X^{(2)}_1 ≥ ⋯ ≥ X^{(2)}_{s_1}. Then consider the cases:

(a) if X^{(2)}_{s_1} ≥ 1, then X̂ = (X^{(2)}_1, …, X^{(2)}_{s_1}, 1, …, 1).

(b) if X^{(2)}_{s_1} < 1, i.e., there is 0 ≤ s_2 ≤ s_1 − 1 such that X^{(2)}_1 ≥ ⋯ ≥ X^{(2)}_{s_2} > 1 ≥ X^{(2)}_{s_2+1} ≥ ⋯ ≥ X^{(2)}_{s_1}, then the problem is reduced once more: find X^{(3)}_1 ≥ ⋯ ≥ X^{(3)}_{s_2} ≥ 1 (= X^{(3)}_{s_2+1} = ⋯ = X^{(3)}_{s_1} = ⋯ = X^{(3)}_n) with s_2 < s_1, repeating the same construction with K^{(3)} := K − Σ_{i=s_2+1}^{n} C_i/T_i, and so on. Since s_1 > s_2 > ⋯ is strictly decreasing, the iteration terminates after at most n steps, when all of the computed factors are at least 1.

Proposition 4.8. Whenever a further iteration is needed, the factors that are still larger than 1 can only shrink: for each i with X^{(ℓ)}_i > 1 (ℓ ≥ 1), X^{(ℓ+1)}_i < X^{(ℓ)}_i. In other words, each needed iteration will decrease the X's which are larger than 1.
Example 4.9. For instance, take a look at the following experimental example. Suppose

    T := (T_1, T_2, T_3, T_4) = (25391, 14905, 12913, 5758)

and

    C := (C_1, C_2, C_3, C_4) = (4616, 6073, 575, 515).

Notice that T_1 ≥ T_2 ≥ T_3 ≥ T_4.

    ith iteration   X^{(i)}_1   X^{(i)}_2   X^{(i)}_3   X^{(i)}_4
    1               1.23        1.03        0.99        0.75
    2               1.19        0.99        1           1
    3               1.18        1           1           1

So three iterations are needed to get X̂ = (1.18, 1, 1, 1) and r−1 = 1.
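The iteration above can be carried out mechanically. The following MATLAB sketch (an illustration, not the thesis code; it assumes the tasks are already sorted so that T(1) ≥ ⋯ ≥ T(n)) should reproduce the table for the data of Example 4.9:

    T = [25391 14905 12913 5758];  C = [4616 6073 575 515];
    n = numel(T);
    K = n*(2^(1/n) - 1);            % RM bound, K^(1)
    X = ones(1, n);
    s = n;                          % indices 1..s are still free
    while true
        denom = sum(T(1:s).^(1/3) .* (C(1:s)./T(1:s)));
        X(1:s) = T(1:s).^(1/3) * K / denom;   % Lagrange step (4.15)/(4.18)
        if X(s) >= 1, break; end              % all free factors >= 1: done
        below = find(X(1:s) <= 1, 1);         % first factor that drops to 1 or below
        K = K - sum(C(below:s)./T(below:s));  % shrink the budget to K^(l+1)
        X(below:s) = 1;
        s = below - 1;
        if s == 0, break; end
    end
    % X comes out as approximately (1.18, 1, 1, 1)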
Remark 4.10. In case Theorem 4.6(2) occurs,

    Y := (X^L_1, …, X^L_{s_1}, 1, …, 1) ∉ R,

since Σ_{i=1}^{n} Y_i C_i/T_i > Σ_{i=1}^{n} X^L_i C_i/T_i = K and because of (4.4). So Y is not a solution to the minimization problem (4.1). Thus there is no contradiction even though f(Y) < f(X^L) (≤ min_{X∈R} f(X)). A similar conclusion can be reached for the consecutive iterations with respect to Proposition 4.8.
The following is another description, from a top-down view; a similar algorithm can be derived from it.

Theorem 4.11. Let T_1 ≥ ⋯ ≥ T_n, and suppose that the minimization problem (4.5) has solution X̂ = (X̂_1, …, X̂_n) ∈ R.

1. X^L_n ≥ 1 if and only if X̂_n ≥ 1. In this case, X̂ = X^L is the unique solution to the minimization problem (4.1).

2. If X^L_n < 1, then the unique solution X̂_1 ≥ ⋯ ≥ X̂_{r−1} > X̂_r = ⋯ = X̂_n = 1 has r where r−1 is the first integer i such that

       T_i^{1/3} K^{(i)} / Σ_{j=1}^{i} T_j^{1/3} C_j/T_j > 1  and  T_{i+1}^{1/3} K^{(i+1)} / Σ_{j=1}^{i+1} T_j^{1/3} C_j/T_j ≤ 1,

   where K^{(k)} := K − Σ_{j=k+1}^{n} C_j/T_j.
To illustrate, let us consider the special case n = 2. We first arrange the T's so that T_1 ≥ T_2. The Lagrange multiplier method yields X^L_i = T_i^{1/3} K / Σ_{j=1}^{2} T_j^{1/3} C_j/T_j, i = 1, 2. Notice that

    X^L_1 = T_1^{1/3} K / ( T_1^{1/3} C_1/T_1 + T_2^{1/3} C_2/T_2 ) = K / ( C_1/T_1 + (T_2/T_1)^{1/3} C_2/T_2 ) ≤ K T_1/C_1 ≤ T_1/C_1,    (4.19)

    X^L_2 = T_2^{1/3} K / ( T_1^{1/3} C_1/T_1 + T_2^{1/3} C_2/T_2 ) = K / ( (T_1/T_2)^{1/3} C_1/T_1 + C_2/T_2 ) ≤ K T_2/C_2 ≤ T_2/C_2,    (4.20)

    X^L_1 = K / ( C_1/T_1 + (T_2/T_1)^{1/3} C_2/T_2 ) ≥ K / ( C_1/T_1 + C_2/T_2 ) ≥ 1    (4.21)

by K ≥ C_1/T_1 + C_2/T_2. Moreover,

    X^L_1 ≥ X^L_2    (4.22)

because T_1 ≥ T_2. So it remains to check whether X^L_2 ≥ 1.
We take another approach and treat f as a function of X_2. Since C_1 X_1/T_1 + C_2 X_2/T_2 = K, we have X_1 = (K − C_2 X_2/T_2) T_1/C_1, so that

    f(X) = f(X_2) = C_1 / [ (K − C_2 X_2/T_2) T_1/C_1 ]² + C_2 / X_2².

There are two singularities on ℝ, namely X_2 = 0, i.e., (X_1, X_2) = (K T_1/C_1, 0), and X_2 = K T_2/C_2, i.e., (X_1, X_2) = (0, K T_2/C_2). Notice that [1, (K − C_1/T_1) T_2/C_2] is the range for X_2, according to the restrictions.

Consider f(X_2):

1. On the open interval (−∞, 0), f decreases from ∞ (as X_2 → 0⁻) to 0 (as X_2 → −∞).

2. On the open interval (0, K T_2/C_2), f has a minimum at

       X^L_2 = K / ( (T_1/T_2)^{1/3} C_1/T_1 + C_2/T_2 ) ≤ K T_2/C_2 ≤ T_2/C_2,

   i.e., at

       (X^L_1, X^L_2) = ( K / ( C_1/T_1 + (T_2/T_1)^{1/3} C_2/T_2 ),  K / ( (T_1/T_2)^{1/3} C_1/T_1 + C_2/T_2 ) ),

   given by the method of Lagrange multipliers.

3. On the open interval (K T_2/C_2, ∞), f decreases from ∞ to 0.

Notice that the interval [1, (K − C_1/T_1) T_2/C_2] is a subset of the open interval (0, K T_2/C_2). Recall that 1 ≤ X^L_1 ≤ T_1/C_1, X^L_2 ≤ T_2/C_2, C_1 X_1/T_1 + C_2 X_2/T_2 = K, and X^L_1 ≥ X^L_2. So we only have the following two cases:

Case 1: X^L_2 ≥ 1. Then (X^L_1, X^L_2) provides the minimum.

Note: Since C_1 X^L_1/T_1 + C_2 X^L_2/T_2 = K, X^L_1 ≥ 1 implies C_2 X^L_2/T_2 = K − C_1 X^L_1/T_1 ≤ K − C_1/T_1, so that

    X^L_2 ≤ (K − C_1/T_1) T_2/C_2.

Similarly, if X^L_2 ≥ 1, then

    X^L_1 ≤ (K − C_2/T_2) T_1/C_1.

Case 2: X^L_2 < 1. The minimum must then be attained on the boundary ∂R. Now

    d f(X_2)/dX_2 = 2 C_1 (C_2/T_2) / [ (K − C_2 X_2/T_2)³ (T_1/C_1)² ] − 2 C_2 / X_2³
                  = ( 2 C_2 / [ (K − C_2 X_2/T_2)³ X_2³ ] ) [ C_1³ X_2³ / (T_1² T_2) − (K − C_2 X_2/T_2)³ ].

Recall that the range for X_2 is [1, (K − C_1/T_1) T_2/C_2]. For 1 ≤ X_2 < T_2/C_2, since X^L_2 < 1, i.e.,

    (C_1/T_1)(T_1/T_2)^{1/3} + C_2/T_2 > K,

we have

    (C_1/T_1)(T_1/T_2)^{1/3} + C_2 X_2/T_2 ≥ (C_1/T_1)(T_1/T_2)^{1/3} + C_2/T_2 > K.

Hence C_1³/(T_1² T_2) > (K − C_2 X_2/T_2)³, and since X_2³ ≥ 1,

    C_1³ X_2³/(T_1² T_2) − (K − C_2 X_2/T_2)³ > 0,

so that d f(X_2)/dX_2 > 0. In other words, f is increasing on the interval, and hence the minimum occurs at X_2 = 1.
The following is a typical picture for the n = 2 case.

Figure 4.1: Plot of f(X_2) for C_1 = 9, T_1 = 19, C_2 = 8, T_2 = 28; here (K − C_1/T_1) T_2/C_2 = 1.2416.
4.2 Scaling Factor Assignment Algorithm
On the basis of the above proofs, we restate our solution as follows:

1. Sort the time periods T_i in descending order.

2. Compute the scaling factors X_i by the method of Lagrange multipliers (X_1 ≥ ⋯ ≥ X_n). If all of them are greater than or equal to 1, they are the solution and we are done. Otherwise, some of the X's are less than 1.

3. Find the X_i that are less than or equal to 1 and set them to 1, i.e., X_r = 1 for r = i, i+1, …, n.

4. Repeat steps 2 and 3 until all X_i ≥ 1.
Table 4.1 is the ScaleFactors pseudocode according to the above algorithm.

 1  Procedure ScaleFactors
 2  Input: task set t = {} of n elements
 3  Output: scale factor set {X_i} of n elements
 4  Sort t in descending T_i
 5  K ← n(2^{1/n} − 1)
 6  r ← n + 1
 7  repeat
 8      denom ← 0, K′ ← K
 9      for j ← 1 to r−1 do denom ← denom + T_j^{1/3} · C_j/T_j
10      for q ← r to n do K′ ← K′ − C_q/T_q
11      trunc ← false
12      for i ← 1 to r−1 do
13          X_i ← T_i^{1/3} · K′/denom
14          if (X_i ≤ 1) then
15              r ← i, trunc ← true
16              break
17          end if
18      end for
19  until (not trunc) or (r = 1)
20  for i ← r to n do X_i ← 1
21  end ScaleFactors

Table 4.1: Procedure ScaleFactors Pseudocode
Let us analyze the ScaleFactors algorithm.

• Sorting the task set t of n elements takes O(n lg n) time if we use the Quicksort algorithm.

• Initializing the variables K and r at the beginning takes O(1) time each.

• There are three loops nested inside the outer repeat loop. The bodies of the first and second inner loops, controlled by the counters j and q, are together executed n times per outer iteration, for j = 1, …, r−1 and q = r, …, n; that is, O(n) time.

• Similarly, the body of the third inner loop, controlled by the counter i, is executed at most r−1 times, depending on the current value of the variable r. Since r is initialized to n+1 and never exceeds n+1, this loop body executes at most n times per outer iteration, so the third inner loop also takes O(n) time.

• The outer loop terminates when trunc stays false or when r reaches 1. In the best case, trunc is never assigned true (or r = 1 immediately), and only one outer iteration is needed. In the worst case, r decreases by only one position per outer iteration, giving O(n) outer iterations.

In sum, the running time of the ScaleFactors algorithm is O(n lg n) + O(1) + O(n²) = O(n²).
However, since our goal is to find (X̂_1, X̂_2, …, X̂_{r−1}) for some r = 1, …, n, if we can locate the first X less than 1 sooner in each iteration, we can improve the complexity of the algorithm. Using the divide-and-conquer paradigm of Table 4.2 to arrive at X̂_{r−1} takes O(n lg n) run-time, so the overall complexity is reduced to O(n lg n).
 1  Procedure ScaleFactorsDnC
 2  Input: task set t = {} of n elements
 3  Output: scale factor set {X_i} of n elements
 4  Sort t in descending T_i
 5  K ← n(2^{1/n} − 1)
 6  first ← 1, last ← n
 7  while (first ≤ last) do
 8      mid ← ⌊(first + last)/2⌋
 9      r ← mid + 1
10      denom ← 0, K′ ← K
11      for j ← 1 to r−1 do denom ← denom + T_j^{1/3} · C_j/T_j
12      for q ← r to n do K′ ← K′ − C_q/T_q
13      X_mid ← T_mid^{1/3} · K′/denom
14      if (X_mid > 1) then
15          first ← mid + 1
16      else
17          last ← mid − 1
18      end if
19  end while
20  r ← first
21  denom ← Σ_{j=1}^{r−1} T_j^{1/3} · C_j/T_j, K′ ← K − Σ_{q=r}^{n} C_q/T_q
22  for i ← 1 to r−1 do X_i ← T_i^{1/3} · K′/denom
23  for i ← r to n do X_i ← 1
24  end ScaleFactorsDnC

Table 4.2: Procedure ScaleFactorsDnC Pseudocode
4.3 Revised Scaling Factor Assignment Algorithm
If we look back at the mathematical form of our solution in (4.11), we find that X̂_i > 1 only when i ≤ r−1. In other words, X̂_i = 1 when i > r−1, according to Proposition 4.7. Therefore, we can also obtain X̂_{r−1} more efficiently if we compute the X's backwards. We adjust the above algorithm and list its pseudocode in Table 4.3.
 1  Procedure ScaleFactorsBackward
 2  Input: task set t = {} of n elements
 3  Output: scale factor set {X_i} of n elements
 4  Sort time periods in descending order
 5  K ← n(2^{1/n} − 1), denom ← 0
 6  for i ← 1 to n do
 7      denom ← denom + T_i^{1/3} · C_i/T_i
 8  end for
 9  for i ← n down to 1 do
10      X_i ← T_i^{1/3} · K/denom
11      if X_i ≤ 1 then
12          X_i ← 1
13          K ← K − C_i/T_i
14          denom ← denom − T_i^{1/3} · C_i/T_i
15          r ← i
16      end if
17  end for
18  end ScaleFactorsBackward

Table 4.3: Procedure ScaleFactorsBackward Pseudocode
• As before, sorting the task set t of n elements takes O(n lg n) time if we use the Quicksort algorithm.

• Initializing the variables K and denom takes O(1) time.

• The body of the first loop, controlled by the counter i, is executed n times, for i = 1, …, n; that is, O(n) time.

• Similarly, the body of the second loop, also controlled by the counter i, is executed n times, for i = n, …, 1, so its statements also take O(n) time.

ScaleFactorsBackward therefore takes O(n lg n) + O(1) + O(n) + O(n) = O(n lg n) time.

If we only consider the complexity of computing the X̂_i, ScaleFactorsBackward takes only O(n) time to arrive at X̂_r, whereas ScaleFactors takes O(n²) and ScaleFactorsDnC takes O(n lg n) time. Therefore, the ScaleFactorsBackward algorithm has lower complexity than the other two.
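For reference, a direct MATLAB transcription of Table 4.3 might look as follows. This is only a sketch (it is not the program listed in the Appendix) and it assumes that T and C are row vectors already sorted with T(1) ≥ ⋯ ≥ T(n):

    function X = scale_factors_backward(T, C)
    % Sketch of Procedure ScaleFactorsBackward (Table 4.3).
    n = numel(T);
    K = n*(2^(1/n) - 1);                 % upper bound of the RM test
    denom = sum(T.^(1/3) .* (C./T));     % sum_j T_j^(1/3) C_j/T_j
    X = zeros(1, n);
    for i = n:-1:1                       % walk backwards over the sorted tasks
        X(i) = T(i)^(1/3) * K / denom;
        if X(i) <= 1                     % clamp and remove task i from the pool
            X(i) = 1;
            K = K - C(i)/T(i);
            denom = denom - T(i)^(1/3) * C(i)/T(i);
        end
    end
    end

On the data of Example 4.9 this single backward pass should return approximately (1.18, 1, 1, 1), matching the iterative computation.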
Chapter 5
Result and Analysis
Consider the following task set:
Task Computation Time, C Time Period, T Utilization, U Priority
a 3 8 0.38 3
b 3 10 0.30 2
c 1 14 0.07 1
Table 5.1: Example Task Set A
Table 5.1 contains three tasks that are assigned priorities by the RM algorithm. Computation time and time period are measured in the number of clock cycles per second. Note that priority is given as an integer: the higher the integer, the higher the priority.

Task a is given the highest priority since it has the shortest time period. The combined CPU utilization from (3.1) is 0.75, which passes the RM test, whose upper bound equals 0.78 for n = 3.
Figure 5.1 is the time-line for Task Set A before scaling. The y-coordinate represents the clock frequency of the processor, and the x-coordinate represents task execution time. a_2 marks the deadline of task a; likewise, b_2 and c_2 are the deadlines of tasks b and c. Notice that all three tasks finish their execution before their deadlines.
Figure 5.1: Time-line for Task Set A before scaling, where computation times are
specified at the maximum CPU frequency
After scaling each task's computation time by its corresponding scaling factor from our solution, Tables 5.2 and 5.3 show the resulting energy reduction.
Task Computation Time, C Utilization, U CPU Frequency Energy
a 3 0.38 1.0
b 3 0.30 1.0 7
c 1 0.07 1.0
Table 5.2: Energy Consumption before Scaling for Task Set A
We suppose the maximum CPU frequency is 1.0. The energy values in Tables 5.2 and 5.3 are calculated from (4.1) using the corresponding CPU frequencies.
Task Computation Time, C Utilization, U CPU Frequency, f Minimum Energy
a 3 0.38 1
b 3.21 0.32 0.94 6.35
c 1.19 0.08 0.84
Table 5.3: Energy Consumption after Scaling for Task Set A
As we see in Table 5.3, the CPU frequency and computation time of each task change individually. For instance, the CPU frequency and computation time of task a remain the same after scaling, whereas for task b the CPU frequency is slightly reduced to 0.94 and its computation time is extended from 3 to 3.21. The CPU frequency of task c is reduced even more. This illustrates the algorithm of Chapter 4: the longer the time period of a task, the greater its scaling factor.

The combined CPU utilization after scaling just reaches 0.78 (the upper bound of the RM test), and the scaled task set saves 17.79% energy.
The scaled time-line is depicted in Figure 5.2. Note that not only are the CPU frequencies of tasks b and c reduced, but the slack time is also shortened. Each task nevertheless still completes its execution before its deadline, which confirms that the scaled task set can still be scheduled by the RM algorithm.
Figure 5.2: Scaled Time-line for Task Set A
Consider another case:
Task Computation Time, C Time Period, T Utilization, U Priority
a 2 14 0.14 1
b 1 10 0.10 3
c 3 12 0.25 2
Table 5.4: Example Task Set B
Task b in Table 5.4 is given the highest priority and task a the lowest. Their combined CPU utilization is 0.49, much less than 0.78.

The time-line scheduled by the RM algorithm for Task Set B is as follows:
Figure 5.3: Time-line for Task Set B
The slack time in task set B is much longer than in task set A, since its combined
CPU utilization is much less than in task set A. Task b has the greatest priority and
executes first. Again, we assume that every task executes at the maximum CPU
frequency 1.0.
Tables 5.5 and 5.6 show the result of power reduction after scaling. The scaled task set saves almost 60.22% energy.
Task Computation Time, C Utilization, U CPU Frequency, f Energy
a 2 0.14 1.0
b 1 0.10 1.0 6
c 3 0.25 1.0
Table 5.5: Energy Consumption before Scaling for Task Set B
Task Computation Time, C Utilization, U CPU Frequency, f Minimum Energy
a 3.32 0.24 0.60
b 1.58 0.16 0.67 2.39
c 4.45 0.37 0.63
Table 5.6: Energy Consumption after Scaling for Task Set B
Figure 5.4: Scaled Time-line for Task Set B
The time-line for task set B after CPU frequency scaling is in Figure 5.4.
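As a rough cross-check of Table 5.6, the scaling factors and the total energy for Task Set B can be recomputed from the closed form (4.8); the MATLAB sketch below is an illustration (not part of the thesis code) using the periods and computation times of Table 5.4:

    T = [14 12 10];  C = [2 3 1];             % Task Set B sorted by descending period (a, c, b)
    n = numel(T);
    K = n*(2^(1/n) - 1);                      % RM bound, about 0.7798
    X = T.^(1/3) * K / sum(T.^(1/3).*(C./T)); % closed-form factors (4.8); all come out > 1 here
    E_before = sum(C);                        % energy at full frequency, 6
    E_after  = sum(C ./ X.^2);                % scaled energy, about 2.39
    saving   = (E_before - E_after)/E_before  % about 0.60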
The results show that the percentage of power saving varies dramatically from one task set to another. In general, the larger the difference between the combined CPU utilization and the upper bound of the RM test, the greater the power reduction.

We should also consider the effect of voltage-switching overhead, since during a voltage switch the CPU cannot execute any task [15]. We could further combine our approach with different dynamic reclaiming algorithms, and we plan to extend this work to real-life examples as future work.
Chapter 6
Summary
In this thesis, we derive a minimum-energy formulation for RM task sets and solve it. Unlike other approaches, our solution works directly with the RM test combined with the power dissipation equation for CMOS circuits. By using the method of Lagrange multipliers, we are able to obtain a scaling factor for each task's execution time.
We assume the processor can vary its supply voltage dynamically, and we ignore the threshold voltage and the voltage-switching overhead. We also assume that every slack time can be fully filled by extending each task's WCET according to its scaling factor.
For a task set containing n tasks, the scaling factors X̂_i can be obtained by iteratively applying the following formula and testing whether each scaling factor X̂_i is less than or equal to 1:

    X̂_i = T_i^{1/3} K′ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j,  i = 1, …, r−1, for some r = 1, …, n,

    K′ := n(2^{1/n} − 1) − Σ_{i=r}^{n} C_i/T_i.
The experimental results show that the minimum power consumption is attained exactly when all X̂_i are greater than or equal to 1. At the same time, the RM task set is still guaranteed to meet all deadlines; our solution is proven not to violate task deadlines.
Bibliography
[1] A. Burns and A. Wellings, Real Time Systems and Programming Languages: Ada
95, Real-Time Java and Real-Time C/POSIX, 3rd ed., Addison Wesley, 2001.
[2] A. P. Chandrakasan, S. Sheng and R. W. Brodersen, "Low-Power CMOS Digital Design," IEEE Journal of Solid-State Circuits, vol. 35, pp. 473-484, Apr. 1992.

[3] A. Chilambuchelvan, S. Saravanan and J. R. P. Perinbam, "A Simulation Software Development for Performance Analysis of DVS Algorithm for Low Power Embedded System," ARPN Journal of Engineering and Applied Sciences, vol. 2, pp. 27-31, Aug. 2007.

[4] F. Gruian, "Hard Real-Time Scheduling for Low-Energy Using Stochastic Data and DVS Processors," In Proc. the 2001 International Symposium on Low Power Electronics and Design, 2001, pp. 46-51.

[5] H. Aydin, R. Melhem, D. Mossé and P. Mejía-Alvarez, "Dynamic and Aggressive Scheduling Techniques for Power-Aware Real-Time Systems," Real-Time Systems, pp. 225-232, Jun. 2001.
[6] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press,
1985.
[7] W. Kim, D. Shin, H. S. Yun, J. Kim, and S. L. Min, "Performance Comparison of Dynamic Voltage Scaling Algorithms for Hard Real-Time Systems," In Proc. the 8th IEEE Real-Time and Embedded Technology and Applications Symposium, 2002, pp. 219-228.
[8] Y. H. Tsai, K. Wang, and J. M. Chen, "A Deferred-Workload-based Inter-Task Dynamic Voltage Scaling Algorithm for Portable Multimedia Devices," In Proc. the 2007 International Conference on Wireless Communications and Mobile Computing, 2007, pp. 677-682.

[9] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," Journal of the ACM (JACM), vol. 20, pp. 46-61, Jan. 1973.

[10] Y. Liu and A. K. Mok, "An Integrated Approach for Applying Dynamic Voltage Scaling to Hard Real-Time Systems," In Proc. the 9th IEEE Real-Time and Embedded Technology and Applications Symposium, 2003, pp. 116-123.

[11] A. Manzak and C. Chakrabarti, "Variable Voltage Task Scheduling Algorithms for Minimizing Energy/Power," In Proc. the 2001 International Symposium on Low Power Electronics and Design, 2001, pp. 279-282.

[12] P. Pillai and K. G. Shin, "Real-Time Dynamic Voltage Scaling for Low-Power Embedded Operating System," In Proc. the 18th ACM Symposium on Operating Systems Principles, 2001, pp. 89-102.

[13] D. Shin, J. Kim and S. Lee, "Intra-Task Voltage Scheduling for Low-Energy Hard Real-Time Applications," IEEE Design and Test of Computers, vol. 18, pp. 20-30, Mar. 2001.

[14] Y. Shin and K. Choi, "Power Conscious Fixed Priority Scheduling for Hard Real-Time Systems," In Proc. the 36th ACM/IEEE Conference on Design Automation, 1999, pp. 134-139.

[15] V. Swaminathan and K. Chakrabarty, "Investigating the Effect of Voltage-Switching on Low-Energy Task Scheduling in Hard Real-Time Systems," In Proc. the 2001 Conference on Asia South Pacific Design Automation, 2001, pp. 251-254.

[16] O. S. Unsal and I. Koren, "System-Level Power-Aware Design Techniques in Real-Time Systems," Proceedings of the IEEE, 2003, pp. 1055-1069.

[17] M. Weiser, B. Welch, A. Demers, and S. Shenker, "Scheduling for Reduced CPU Energy," In Proc. the 1st USENIX Conference on Operating Systems Design and Implementation, 1994, pp. 13-23.

[18] X. L. Zhong and C. Z. Xu, "Energy-Aware Modeling and Scheduling of Real-Time Tasks for Dynamic Voltage Scaling," In Proc. the 26th IEEE International Real-Time Systems Symposium, 2005.
Appendices
function y = MinEnergy(T,C)
...
[T,NDX] = sort(T,'descend');        % sort T in descending order
C = C(NDX);                         % sort C accordingly (not necessarily descending)
unsort(NDX) = 1:n;                  % inverse sort; see http://blogs.mathworks.com/loren/?p=104
K = n*(2^(1/n)-1);                  % upper bound of RM
C1 = C;
T1 = T;
i = 1;
while i <= n
    D = sum((T1.^(1/3)).*(C1./T1)); % denominator
    X(i) = (T1(i))^(1/3)*K/D;
    if X(i) <= 1
        X(i) = 1;
        K = K - sum((C1(i:n))./(T1(i:n)));
        T1 = T1(1:i-1); C1 = C1(1:i-1);
        n = i-1; i = 1;
    else
        i = i+1;
    end
end
unsort;
X = X(unsort)                       % print the real X
end