Minimum Power Consumption for Rate Monotonic Tasks
Except where reference is made to the work of others, the work described in this
thesis is my own or was done in collaboration with my advisory committee. This
thesis does not include proprietary or classified information.
Chiao Ching Huang
Certificate of Approval:
Cheryl Seals
Associate Professor
Computer Science and Software Engi-
neering
Sanjeev Baskiyar, Chair
Associate Professor
Computer Science and Software Engi-
neering
Tin-Yau Tam
Professor
Mathematics and Statistics
George T. Flowers
Dean
Graduate School
Minimum Power Consumption for Rate Monotonic Tasks
Chiao Ching Huang
A Thesis
Submitted to
the Graduate Faculty of
Auburn University
in Partial Fulfillment of the
Requirements for the
Degree of
Master of Science
Auburn, Alabama
December 19, 2008
Minimum Power Consumption for Rate Monotonic Tasks
Chiao Ching Huang
Permission is granted to Auburn University to make copies of this thesis at its
discretion, upon the request of individuals or institutions and at
their expense. The author reserves all publication rights.
Signature of Author
Date of Graduation
Vita
Chiao Ching Huang was born in Tainan City, Taiwan, on March 19, 1978. She
graduated from Feng-Chia University with a Bachelor of Science degree in 2000. After
working for Chinese Petroleum Corp. for almost five years, she entered, in the fall of 2005, the
graduate program in the Department of Computer Science and Software Engineering
at Auburn University. While in pursuit of her Master of Science degree, she worked
as a graduate assistant for Auburn University Graduate School under Deans George
Flowers and Joe Pittman.
Thesis Abstract
Minimum Power Consumption for Rate Monotonic Tasks
Chiao Ching Huang
Master of Science, December 19, 2008
(B.S., Feng Chia University, 2000)
62 Typed Pages
Directed by Sanjeev Baskiyar
Embedded computer systems are widely used in modern life and their use is expanding. One of the typical constraints in embedded systems, particularly in stand-alone devices, is their limited power capacity. One way to extend battery lifetime is to reduce power consumption; because of the quadratic relationship between power consumption in CMOS circuits and CPU voltage, researchers can achieve power reduction by scaling down the supply voltage through Dynamic Voltage Scaling (DVS). However, reducing the supply voltage also slows down the CPU, since the supply voltage is roughly proportional to the clock frequency of the processor, i.e., the CPU speed. As a result, DVS succeeds at the cost of system performance. In a Real-Time embedded environment, especially in Hard Real-Time embedded systems, the timing constraint is a critical element that cannot be ignored. It is therefore difficult to balance power savings against system throughput so that all tasks still complete before their deadlines.
In this thesis, we focus on tasks scheduled by the Rate Monotonic (RM) algorithm in a Hard Real-Time embedded environment. We derive an equation for scaling the worst-case execution time (WCET) of each task, expanding each WCET by a factor that depends on the corresponding task's computation time, until the slack times are fully occupied. Unlike other approaches, we combine the power consumption equation with the constraint of the Rate Monotonic schedulability test (RM test). From the result of this formulation, we find the minimum power consumption at which the RM task set still passes the RM test and all tasks are guaranteed to meet their deadlines. Our approach can be categorized as an off-line Intra-Task Dynamic Voltage Scaling (IntraDVS) technique.
Acknowledgments
I would like to thank God for giving me an opportunity to attend graduate
school and pursue a master's degree. I would also like to thank Him for the immense
strength, constant support and unconditional love.
I would like to extend my sincere appreciation to the members of my master's
committee for their knowledge, time and assistance.
I am thankful to my academic advisor, Dr. Sanjeev Baskiyar, for his support, pa-
tience, encouragement and priceless guidance. His enthusiasm, creativity, and careful
review have helped me accomplish an enormous task.
I am especially thankful to Dr. Tin-Yau Tam for guiding me through the most
critical part of my research. The math proofs and the Matlab program were written
under his valuable directions.
I would like to take this opportunity to thank the Graduate School for providing
me with an assistantship and financial support.
Lastly, I thank my beloved family and friends for their endless love, support,
patience and encouragement.
Style manual or journal used: Journal of Approximation Theory (together with the style known as "aums"). Bibliography follows van Leunen's A Handbook for Scholars.

Computer software used: The document preparation package TeX (specifically LaTeX) together with the departmental style file aums.sty.
Table of Contents

List of Figures
List of Tables
1 Introduction
  1.1 Current Problem
  1.2 Motivation
  1.3 Thesis Outline
2 Dynamic Voltage Scaling
  2.1 CMOS Circuit Design
  2.2 DVS Algorithm with DVS Processor
  2.3 Problems
3 Formalization
  3.1 Rate Monotonic Schedulability Test
  3.2 Power Consumption
  3.3 Energy Consumption
  3.4 Problem Definition
4 Solution to the optimization problem
  4.1 Mathematical Statements and Proofs
  4.2 Scaling Factor Assignment Algorithm
  4.3 Revised Scaling Factor Assignment Algorithm
5 Result and Analysis
6 Summary
Bibliography
Appendices
List of Figures

4.1 Plot of f(X_2)
5.1 Time-line for Task Set A before scaling, where computation times are specified at the maximum CPU frequency
5.2 Scaled Time-line for Task Set A
5.3 Time-line for Task Set B
5.4 Scaled Time-line for Task Set B
List of Tables

4.1 Procedure ScaleFactors Pseudocode
4.2 Procedure ScaleFactorsDnC Pseudocode
4.3 Procedure ScaleFactorsBackward Pseudocode
5.1 Example Task Set A
5.2 Energy Consumption before Scaling for Task Set A
5.3 Energy Consumption after Scaling for Task Set A
5.4 Example Task Set B
5.5 Energy Consumption before Scaling for Task Set B
5.6 Energy Consumption after Scaling for Task Set B
Chapter 1
Introduction
1.1 Current Problem
The maturity of wireless technology and mobile markets has dramatically changed people's lifestyles. More and more consumers depend on wireless or portable devices such as cellular phones, personal digital assistants (PDAs), digital cameras, and global positioning system (GPS) receivers. With the rapidly growing popularity of handheld devices, power supply or battery lifetime has always been one of the major determinants in the purchase decision. For decades, researchers have sought the optimal solution to save power in handhelds and to produce more energy-efficient power supplies.
As far as Real-Time embedded systems are concerned, power consumption becomes an even bigger issue. One of the well-known characteristics of a Real-Time Operating System is its time constraint. In a Soft Real-Time system, missed deadlines may not affect the whole system's functionality. However, a missed deadline in a Hard Real-Time system could lead to a non-recoverable system failure. Reducing power dissipation often means sacrificing system performance; that is, tasks may complete late, which cannot be tolerated in a Hard Real-Time system [12].
1.2 Motivation
This thesis focuses on Rate Monotonic (RM) schedulable tasks for Hard Real-Time embedded systems. The task model is preemptive and static-priority driven. The approach is to expand the computation time of each task until the combined CPU utilization approximately reaches the upper bound of the Rate Monotonic schedulability test (RM test) [9], thus lowering the CPU speed and supply voltage while guaranteeing that none of the tasks misses its deadline. Using the method of Lagrange multipliers, the approach derives a specific formula and an iteration yielding the exact scaling factor for each task. The computation time of each corresponding task is then increased by applying this scaling factor. Consequently, the minimum power consumption is achieved and all the tasks are processed within their time constraints.
1.3 Thesis Outline
The remainder of this thesis is organized as follows. Chapter 2 reviews recent
studies. Specifically, different kinds of DVS approaches and their challenges will be
introduced and analyzed. Chapter 3 gives a detailed explanation for the formulation
and deduction of the problem. In Chapter 4, we present the mathematical development and explain the solution to the problem described in Chapter 3; we provide a complete proof and present the algorithms derived from this study. Chapter 5 includes experiments and observations made during this research and discusses issues regarding the results, along with suggestions for future work. Chapter 6 concludes this study.
Chapter 2
Dynamic Voltage Scaling
A large number of research activities have been conducted in the past decades
to achieve power reduction. In this chapter, we introduce Dynamic Voltage Scaling
(DVS) techniques and present some of its strategies.
Before we start, we explain some terms in scheduling Hard Real-Time tasks.
Worst Case Execution Times (WCETs)

As implied by the name, the worst-case execution time (WCET) is the estimated maximum possible run-time for a request to complete its task without missing its deadline. As W. Kim et al. [7] explain, "In scheduling Hard Real-Time tasks, in order to guarantee the timing constraint of each task, the execution times of tasks are usually assumed to be the worst case execution times (WCETs)." The WCET is often used as a standard design assumption when scheduling Hard Real-Time tasks.
Slack Time

Slack times can be classified as static slack times and dynamic slack times [7]. Static slack time is defined as the time frame between the moment a task finishes its execution and its next invocation. Off-line (or static) approaches, e.g., static voltage scaling, the path-based method, the stochastic method and maximum constant speed, can fully utilize these static slack times, which are known in advance, by reducing the CPU frequency and consequently the supply voltage.

Dynamic slack time is the difference between the actual execution time and the WCET (the actual execution time being less than the WCET). The actual execution time of a task is unpredictable at run-time; in fact, a large number of tasks finish earlier than their WCETs [5]. For this reason, static approaches cannot save power aggressively. On-line approaches such as stretching to NTA, priority-based slack stealing and utilization updating exploit this characteristic by reclaiming dynamic slack times and recomputing the CPU frequency at task release time to lower power consumption.
Dynamic Voltage Scaling (DVS) is one way to reduce power consumption: it changes the supply voltage dynamically, slowing down the CPU in order to save energy. DVS is based on two observations and has become a mainstream technique in the area of power reduction. The first is the quadratic relationship between power dissipation and supply voltage (p ∝ V²). The second is that, even in a Hard Real-Time environment, as long as tasks complete on time there is no reason to require higher system throughput.
2.1 CMOS Circuit Design
Since hardware makers have made an almost "limitless" number of transistors available, A. P. Chandrakasan et al. [2] proposed redesigning the system architecture: CMOS circuit operations are delayed as much as possible while the lowest feasible supply voltage is used, thereby reducing dynamic power dissipation. This architecture-driven voltage scaling technique is capable of achieving the goal of power savings, regardless of its complexity.
2.2 DVS Algorithm with DVS Processor
With the improvement of CMOS circuit techniques, more and more variable-voltage microprocessors based on CMOS logic have been brought to the market. Researchers have exploited this trend to develop many kinds of DVS algorithms that make use of these available DVS processors.

DVS algorithms can be further classified into intra-task DVS (IntraDVS) and inter-task DVS (InterDVS) algorithms according to how they use slack times. IntraDVS passes slack times from the current task to itself in the next cycle [4, 13]; usually, IntraDVS determines the supply voltage off-line. InterDVS distributes the slack times from the current task to the following tasks, and most existing approaches belong to this class [14, 12, 5, 3, 8]. Besides these two, researchers have also combined the strengths of IntraDVS and InterDVS to form hybrid DVS (HybridDVS) algorithms [7, 3, 8].
The first DVS algorithm was proposed by M. Weiser et al. [17]. This approach is an average-throughput-based DVS algorithm and considers only a Soft Real-Time environment; it cannot guarantee that Hard Real-Time deadlines will be met. Shin and Choi [14] considered the delay overheads of power-down mode and voltage switching and developed Low Power Fixed Priority Scheduling (LPFPS) with a slight modification of the Real-Time scheduler, in which they combined off-line and on-line approaches. However, voltage switching incurs a delay overhead, so their solution still needs a trade-off analysis between increased task execution time and power consumption.
Similar to our approach, the scheme proposed by Manzak and Chakrabarti [11] also tries to find the operating voltage of the processor for each task. They formulate an equation that minimizes total energy consumption subject to a constant total computation time. With the method of Lagrange multipliers, Manzak and Chakrabarti obtain a relationship among the task voltages under the constraint of minimum energy consumption. They further develop an iterative algorithm that adjusts task slack times and therefore the voltage assignments. For RM scheduling, the iterative voltage assignment proceeds until the combined CPU utilization reaches the upper bound of the RM test. Still, there is a trade-off between the energy/power savings and the complexity of this algorithm.

The stochastic IntraDVS described by F. Gruian [4] not only scales the voltage to fill static slack times but also considers the dynamic slack times caused by run-time task executions, combining them with an on-line slack distribution method to minimize energy consumption while meeting all deadlines.
Swaminathan and Chakrabarty [15] took into consideration the effect of voltage-switching times on the energy consumption of the task set and presented a novel mixed-integer linear programming model called extended low-energy earliest-deadline-first (E-LEDF) for this NP-complete scheduling problem. Swaminathan and Chakrabarty's approach targets non-preemptive Earliest-Deadline-First (EDF) task sets rather than preemptive RM task sets.
Pillai and Shin [12] used a single scaling factor, obtained from the lowest feasible CPU frequency for a given task set, for all tasks; their results show that statically scaled RM tasks cannot reduce energy consumption aggressively. Y. H. Tsai et al. [8] further exploited Pillai and Shin's static voltage scaling approach and incorporated it into their Deferred-Workload-based inter-task DVS (dwDVS).
Liu and Mok [10] defined the available cycle function (ACF) and the required cycle function (RCF) to limit the CPU speed so that there is no idle cycle while the CPU executes and no job misses its deadline. To solve the energy minimization problem, they propose an off-line algorithm and a low-complexity on-line dynamic algorithm to reclaim dynamic slack cycles. Liu and Mok's solution can be used with different Real-Time schedulers.
Whereas D. Shin et al. [13] and A. Manzak and C. Chakrabarti [11] used only off-line approaches, most researchers combined on-line and off-line approaches to achieve power savings. Shin and Choi [14] combined off-line and on-line approaches and also brought the processor into a power-down mode. Aydin et al. [5] included an off-line solution to compute the optimal speed and presented an on-line speed adjustment mechanism to reclaim dynamic slack times and further predict future task execution. Similarly, Gruian [4] used off-line task stretching to obtain scaling factors and incorporated on-line slack distribution to further utilize dynamic slack times.
2.3 Problems

Although off-line DVS algorithms have lower complexity and are easy to implement, on-line algorithms can yield even greater energy savings. However, because run-time workloads are difficult to predict, as mentioned above, on-line approaches also have higher complexity and a higher chance of missing task deadlines. To compensate, on-line approaches often speed tasks up so that they catch up with their deadlines. Yet accelerating tasks requires raising the CPU frequency and hence the supply voltage. Such approaches therefore rarely guarantee minimum power consumption; on the contrary, they risk consuming more energy.

Also, even though some of the above studies target a Hard Real-Time environment with task sets scheduled by the RM algorithm, few of them check whether the combined CPU utilization of the task set stays within the upper bound of the RM test. In other words, the task set may still meet all deadlines at run-time, but without assurance.
Chapter 3
Formalization
Since this research is based on the RM scheduling algorithm, the system model makes the same assumptions as Liu and Layland's [9]. Given a task set of n tasks t_i, i = 1, 2, …, n, we assume that the tasks are periodic with time periods T_i, i = 1, 2, …, n, and that each deadline is equal to its period. All the tasks are independent, and each has its own computation time C_i, i = 1, 2, …, n.
3.1 Rate Monotonic Schedulability Test

The combined CPU utilization u for multiprogramming in a Hard Real-Time embedded operating system is

    u = Σ_{i=1}^{n} C_i/T_i = C_1/T_1 + C_2/T_2 + ⋯ + C_n/T_n    (3.1)

where C_i is the computation time, T_i is the time period, and n is the number of tasks in the task set.

For the Rate Monotonic (RM) scheduling algorithm, Liu and Layland [9] showed that if u ≤ n(2^{1/n} − 1), the tasks t_i, i = 1, 2, …, n, are schedulable and will not miss their deadlines; this condition is called the Rate Monotonic schedulability test (RM test). Thus, the upper bound of the RM test is n(2^{1/n} − 1). The RM test "is said to be sufficient but not necessary" [1]. In other words, if a task set passes the RM test, it will meet all deadlines; if it fails the test, it may or may not fail at run-time.
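As a concrete illustration, the RM test can be evaluated in a couple of lines of MATLAB; the fragment below is a sketch, not part of the thesis code, and uses the data of Task Set A from Chapter 5:

    C = [3 3 1];  T = [8 10 14];        % computation times and periods
    n = numel(T);
    u = sum(C ./ T);                    % combined utilization, eq. (3.1)
    bound = n*(2^(1/n) - 1);            % RM upper bound, about 0.78 for n = 3
    schedulable = (u <= bound)          % true: this set passes the RM test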
3.2 Power Consumption

The power consumed per CPU cycle is

    p = α f N V²    (3.2)

where p is the power dissipation, α is the average activity factor, f is the clock frequency of the processor (the CPU frequency), N is the switching capacitance of the wires and transistor gates, and V is the supply voltage.

From (3.2), the power dissipation p is proportional to the CPU frequency f, the switching capacitance N, and V². Since the switching capacitance N is proportional to the number of transistors on the chip [16] and all the tasks use the same processor, N is constant. We therefore consider only reducing the CPU frequency f or the supply voltage V to achieve power reduction.

Since V is approximately proportional to the CPU frequency f, reducing f also reduces V proportionally [18]. We can thus say that the power dissipation p is proportional to the cube of the CPU frequency f:

    p ∝ f³    (3.3)

Reducing the CPU frequency f by a factor 1/X_i, where X_i is a scaling factor and is always greater than or equal to 1, increases the computation time C_i to X_i C_i. Thus we have the new combined CPU utilization u′:

    u′ = Σ_{i=1}^{n} X_i C_i/T_i = X_1 C_1/T_1 + X_2 C_2/T_2 + ⋯ + X_n C_n/T_n ≤ n(2^{1/n} − 1)    (3.4)
3.3 Energy Consumption

The energy consumption E_i for executing task t_i is

    E_i ∝ C_i · p_i    (3.5)

where C_i is the computation time of task t_i and p_i is the power dissipation of the processor. According to (3.3), we can rewrite the energy consumption E_i in the following form:

    E_i ∝ C_i · f_i³    (3.6)

where f_i is the CPU frequency. Since f is reduced to f/X_i, along with the equivalent scaling of C_i to X_i C_i, the scaled energy consumption E′_i becomes

    E′_i ∝ X_i C_i · (f/X_i)³ = C_i f³ / X_i²    (3.7)
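For example, if a task's frequency is halved (X_i = 2), its execution time doubles, but by (3.7) its energy consumption drops to C_i f³/4, i.e., to one quarter of its unscaled value. This is why trading speed for voltage is worthwhile whenever the schedule can absorb the longer execution time.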
3.4 Problem Definition

Based on the above derivations, we can restate our problem in the following form:

Given u = Σ_{i=1}^{n} C_i/T_i ≤ n(2^{1/n} − 1), our goal is to find X̂ := (X̂_1, …, X̂_n) which yields

    min Σ_{i=1}^{n} C_i / X̂_i²

subject to the constraints

    Σ_{i=1}^{n} X̂_i C_i/T_i ≤ n(2^{1/n} − 1)  and  1 ≤ X̂_i ≤ T_i/C_i.

Associated with our problem are the following parameters:

• u: combined CPU utilization
• n: number of tasks in a task set
• C_i: computation time of task t_i
• T_i: time period of task t_i
• X_i: scaling factor of task t_i
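For later reference, checking whether a candidate vector of scaling factors is feasible, and evaluating the objective, takes only a few lines of MATLAB. The sketch below merely illustrates the problem statement (it is not part of the thesis code) and assumes C, T and a candidate X are row vectors of equal length:

    f_obj    = sum(C ./ X.^2);                          % objective of the minimization
    feasible = all(X >= 1) && all(X <= T./C) && ...
               sum(X .* C ./ T) <= numel(T)*(2^(1/numel(T)) - 1);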
Chapter 4
Solution to the optimization problem
4.1 Mathematical Statements and Proofs
Based on the above derivations, we restate the problem in the following mathematical form:

Problem: Find X̂ := (X̂_1, …, X̂_n) which yields

    min Σ_{i=1}^{n} C_i / X_i²    (4.1)

subject to X = (X_1, …, X_n) ∈ S, where

    S := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i ≤ K, 1 ≤ X_i ≤ T_i/C_i }    (4.2)

and K := n(2^{1/n} − 1). We write f(X) := Σ_{i=1}^{n} C_i/X_i² for the objective and g(X) := Σ_{i=1}^{n} X_i C_i/T_i for the utilization constraint.

Lemma 4.2. The function h(n) := n(2^{1/n} − 1) is strictly monotonically decreasing and lim_{n→∞} h(n) = ln 2; in particular, h(n) < 1 for n ≥ 2.

Proof. Set a := 2^{1/(n(n+1))} > 1. By using the identity aⁿ − 1 = (a − 1)(a^{n−1} + a^{n−2} + ⋯ + a + 1), we obtain

    h(n) − h(n+1) = n(2^{1/n} − 1) − (n+1)(2^{1/(n+1)} − 1)
                  = n[ (2^{1/(n(n+1))})^{n+1} − 1 ] − (n+1)[ (2^{1/(n(n+1))})ⁿ − 1 ]
                  = (a − 1)[ n(aⁿ + a^{n−1} + ⋯ + 1) − (n+1)(a^{n−1} + ⋯ + 1) ]
                  = (a − 1)[ n aⁿ − (a^{n−1} + ⋯ + 1) ] ≥ 0

since aⁿ ≥ aⁱ for i = 1, …, n−1 as a > 1. So h(n) is strictly monotonically decreasing. By l'Hôpital's rule,

    lim_{n→∞} h(n) = lim_{n→∞} (2^{1/n} − 1)/(1/n) = lim_{n→∞} [ −(1/n²) 2^{1/n} ln 2 ] / [ −1/n² ] = lim_{n→∞} 2^{1/n} ln 2 = ln 2.
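To give a sense of how quickly the bound shrinks, the following short MATLAB fragment (a sketch, not part of the thesis code) evaluates h(n) for a few values of n:

    n = [1 2 3 4 10 100];
    h = n .* (2.^(1./n) - 1);    % RM bound h(n) = n(2^(1/n) - 1)
    disp([n; h]')                % approx. 1.0000, 0.8284, 0.7798, 0.7568, 0.7177, 0.6956

The values approach ln 2 ≈ 0.6931 from above, and for n ≥ 2 they are strictly below 1, as Lemma 4.2 states.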
Remark 4.3. When n ≥ 2, one can rewrite the set S as

    S = { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i ≤ n(2^{1/n} − 1), 1 ≤ X_i },

i.e., one can ignore the condition X_i ≤ T_i/C_i in (4.2), since it follows from Σ_{i=1}^{n} X_i C_i/T_i ≤ n(2^{1/n} − 1) < 1 (by Lemma 4.2) and 1 ≤ X_i. But we keep using (4.2) as the description of S.
Each X ∈ S must satisfy

    X_i < T_i/C_i,  i = 1, …, n,    (4.3)

otherwise X_j = T_j/C_j for some j, so that by Lemma 4.2 we would have

    Σ_{i=1}^{n} X_i C_i/T_i ≥ X_j C_j/T_j = 1 > n(2^{1/n} − 1),

a contradiction.
Lemma 4.4. The minimization problem (4.1) only has solution(s) in the compact set R, where

    R := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = n(2^{1/n} − 1), 1 ≤ X_i < T_i/C_i } ⊂ S.    (4.4)
Proof. Set X_ε := X + ε(1, …, 1) = (X_1 + ε, …, X_n + ε). If the minimum of f(X) occurred at a point X ∈ S with Σ_{i=1}^{n} X_i C_i/T_i strictly less than K, then for all sufficiently small ε > 0 we would still have X_ε ∈ S, while f(X_ε) < f(X) because every coordinate increases; this contradicts the minimality of f(X). Hence every minimizer satisfies Σ_{i=1}^{n} X_i C_i/T_i = K and, by (4.3), it lies in R.

Accordingly, problem (4.1) reduces to

    min_{X ∈ R} f(X).    (4.5)

Theorem 4.5. Let T_1 ≥ ⋯ ≥ T_n. If X̂ = (X̂_1, …, X̂_n) ∈ R is a solution to the minimization problem (4.5), then X̂_1 ≥ ⋯ ≥ X̂_n.

Proof. Suppose, on the contrary, that X̂_i < X̂_j for some i < j (so that T_i ≥ T_j). For δ > 0, define

    X̂(δ) := (X̂_1, …, X̂_i + δ, …, X̂_j − δ C_i T_j/(C_j T_i), …, X̂_n),

which leaves the constraint sum Σ_k X̂_k(δ) C_k/T_k = K unchanged; for all sufficiently small δ > 0, we have X̂(δ) ∈ R. Set

    φ(δ) := f(X̂(δ)) = C_1/X̂_1² + ⋯ + C_i/(X̂_i + δ)² + ⋯ + C_j/(X̂_j − δ C_i T_j/(C_j T_i))² + ⋯ + C_n/X̂_n².

Now

    dφ(δ)/dδ = −2C_i/(X̂_i + δ)³ + 2(C_i T_j/T_i)/(X̂_j − δ C_i T_j/(C_j T_i))³
             = 2C_i [ (X̂_i + δ)³ T_j/T_i − (X̂_j − δ C_i T_j/(C_j T_i))³ ] / [ (X̂_i + δ)³ (X̂_j − δ C_i T_j/(C_j T_i))³ ].

So

    φ′(0) = 2C_i [ X̂_i³ T_j/T_i − X̂_j³ ] / ( X̂_i³ X̂_j³ ) < 0,

since 1 ≤ X̂_i < X̂_j and T_j ≤ T_i. This contradicts the assumption that f(X̂) is the minimum.
We will first solve a slightly different problem: find the X ∈ R′ that yields

    min_{X ∈ R′} f(X),

where

    R′ = { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = K }.

In other words, we remove the condition 1 ≤ X_i ≤ T_i/C_i from the original minimization problem (4.1). Note that R ⊂ R′, and the region R′ is represented by the equation Σ_{i=1}^{n} X_i C_i/T_i = K and is a hyperplane in ℝⁿ. By the method of Lagrange multipliers, set

    ∇f = λ ∇g,

which implies

    −2C_i/X_i³ = λ C_i/T_i.

Since C_i ≠ 0,

    X^L_i = ( 2T_i/(−λ) )^{1/3}.    (4.6)

Substituting (4.6) into Σ_{i=1}^{n} X_i C_i/T_i = K gives

    Σ_{j=1}^{n} ( 2T_j/(−λ) )^{1/3} C_j/T_j = K.

Thus we have

    (1/(−λ)^{1/3}) Σ_{j=1}^{n} (2T_j)^{1/3} C_j/T_j = K.    (4.7)

So from (4.6) and (4.7),

    X^L_i = (2T_i)^{1/3} K / Σ_{j=1}^{n} (2T_j)^{1/3} C_j/T_j = T_i^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j < T_i/C_i    (4.8)

(since K < 1), which is the only critical point of f over R′. Moreover,

    f(X^L) = Σ_{i=1}^{n} C_i/(X^L_i)² = [ ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j )² / K² ] · ( Σ_{i=1}^{n} C_i/T_i^{2/3} )    (4.9)

is the local minimum value of f(X) over R′; thus it is the global minimum over R′. If T_1 ≥ ⋯ ≥ T_n, then from (4.8)

    X^L_1 ≥ ⋯ ≥ X^L_n.

However, we cannot conclude that 1 ≤ X^L_i for all i = 1, …, n, i.e., X^L may not be in R, since the second constraint of (4.1) need not be satisfied. Nevertheless, it is clear that

    f(X^L) = min_{X ∈ R′} f(X) ≤ min_{X ∈ R} f(X).    (4.10)
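As a quick numerical check, the unconstrained Lagrange solution (4.8) can be evaluated with a few lines of MATLAB; the sketch below is an illustration only, not the thesis code, and assumes T and C are row vectors already sorted so that T(1) ≥ ⋯ ≥ T(n):

    % Closed-form critical point X^L of (4.8) over the hyperplane R'
    n = numel(T);
    K = n*(2^(1/n) - 1);                 % RM utilization bound
    denom = sum(T.^(1/3) .* (C ./ T));   % sum_j T_j^(1/3) C_j/T_j
    XL = T.^(1/3) * K / denom;           % X^L_i = T_i^(1/3) K / denom
    % If min(XL) >= 1, XL already solves (4.1); otherwise iterate as in Section 4.2.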
Theorem 4.6. Let T_1 ≥ ⋯ ≥ T_n, and let X̂ = (X̂_1, …, X̂_n) ∈ R be a solution to the minimization problem (4.5).

1. X^L_n ≥ 1 if and only if X̂_n ≥ 1. In this case, X̂ = X^L is a solution to the minimization problem (4.1).

2. If X^L_n < 1, then

       X̂_1 ≥ ⋯ ≥ X̂_{r−1} > X̂_r = ⋯ = X̂_n = 1    (4.11)

   for some r = 1, …, n. Moreover, (X̂_1, …, X̂_{r−1}) is a solution to the following minimization problem:

       min Σ_{i=1}^{r−1} C_i/X_i² + Σ_{i=r}^{n} C_i

   subject to the (new) conditions

   (a) 1 ≤ X_i ≤ T_i/C_i, i = 1, …, r−1,
   (b) Σ_{i=1}^{r−1} C_i X_i/T_i ≤ K̃, where K̃ := K − Σ_{i=r}^{n} C_i/T_i.

   Namely,

       X̂_i = T_i^{1/3} K̃ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j,  i = 1, …, r−1.    (4.12)

In either case, X̂ is the unique solution to the minimization problem (4.5).
Proof. By Theorem 4.5, the solution X̂ ∈ R to the minimization problem (4.5) satisfies X̂_1 ≥ ⋯ ≥ X̂_n.

(1) By (4.8), X^L_1 ≥ ⋯ ≥ X^L_n, X^L_i < T_i/C_i for all i = 1, …, n, and Σ_{i=1}^{n} X^L_i C_i/T_i = K. So X^L_n ≥ 1 if and only if X^L ∈ R. In this case, X̂ = X^L by (4.10), and it is the unique solution since there is only one critical point for the minimization problem over R′.

(2) If X^L_n < 1, then X^L ∉ R. Since X^L is the only critical point in R′ and R ⊂ R′, we conclude that the minimum point(s) of f over R must lie on the boundary ∂R of R:

    ∂R := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = K, X_i = 1 for some i or X_j = T_j/C_j for some j }.

Since X_i < T_i/C_i for all i by (4.3), X̂ must occur in

    ∂̃R := { X ∈ ℝⁿ : Σ_{i=1}^{n} X_i C_i/T_i = K, X_i = 1 for some i }.

In other words, via Theorem 4.5 we have (4.11), i.e.,

    X̂_1 ≥ ⋯ ≥ X̂_{r−1} > X̂_r = ⋯ = X̂_n = 1,

for some r = 1, …, n. Because the region

    R_r := { (X_1, …, X_{r−1}, 1, …, 1) ∈ ℝⁿ : Σ_{i=1}^{r−1} C_i X_i/T_i + Σ_{i=r}^{n} C_i/T_i = K, 1 ≤ X_i ≤ T_i/C_i, i = 1, …, r−1 }

is a subset of R, (X̂_1, …, X̂_{r−1}) is the solution to the following minimization problem:

    min Σ_{i=1}^{r−1} C_i/X_i² + Σ_{i=r}^{n} C_i

subject to the (new) constraints

1. 1 ≤ X_i ≤ T_i/C_i, i = 1, …, r−1,
2. Σ_{i=1}^{r−1} C_i X_i/T_i ≤ K̃, where K̃ := K − Σ_{i=r}^{n} C_i/T_i.

Since T_1 ≥ ⋯ ≥ T_{r−1} and X̂_{r−1} > 1, by Theorem 4.6(1) the minimum is attained at (X̂_1, …, X̂_{r−1}), which is provided by the method of Lagrange multipliers, i.e.,

    X̂_i = T_i^{1/3} K̃ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j > 1,  i = 1, …, r−1.

Moreover, the solution is unique once r is fixed.

To show that X̂ is unique, it is sufficient to show that r is unique. Suppose on the contrary that (X̂_1, …, X̂_{r−1}, 1, …, 1) ∈ ℝⁿ and (X̂_1, …, X̂_{r′−1}, 1, …, 1) ∈ ℝⁿ both yield the minimum, i.e.,

    f((X̂_1, …, X̂_{r−1}, 1, …, 1)) = f((X̂_1, …, X̂_{r′−1}, 1, …, 1)) = min_{X ∈ R} f(X),

but r ≠ r′, where the second candidate is given on its leading coordinates by the corresponding Lagrange formula

    X̂_i = T_i^{1/3} K̃′ / Σ_{j=1}^{r′−1} T_j^{1/3} C_j/T_j > 1,  i = 1, …, r′−1,  with K̃′ := K − Σ_{i=r′}^{n} C_i/T_i.

For definiteness, assume r < r′; comparing the two candidates then gives f((X̂_1, …, X̂_{r−1}, 1, …, 1)) > f((X̂_1, …, X̂_{r′−1}, 1, …, 1)), a contradiction. So r is unique, and hence (X̂_1, …, X̂_{r−1}) and X̂ are unique.
When X^L_n < 1, to solve problem (4.1) it suffices to determine r−1, which is the largest index i such that X̂_i > 1, and apply (4.12). The following is very useful with respect to the determination of r−1.

Proposition 4.7. Let T_1 ≥ ⋯ ≥ T_n, and let r−1 be the largest integer i such that X̂_i > 1 (r−1 = 0 means that all of the X̂_i are 1). Then

    r−1 ≤ s_1,    (4.13)
where s_1 denotes the largest integer i such that X^L_i > 1 (s_1 = 0 means that all of the X^L_i are less than or equal to 1), i.e.,

    X^L_1 ≥ ⋯ ≥ X^L_{s_1} > 1 ≥ X^L_{s_1+1} ≥ ⋯ ≥ X^L_n.

Evidently, if X^L_n ≥ 1, then X̂ = X^L and r−1 = s_1.

Proof. Suppose on the contrary that r−1 > s_1. We would then have r−1 ≥ s_1 + 1, so that X^L_{r−1} ≤ 1. From (4.8),

    X^L_{r−1} = T_{r−1}^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ≤ 1  ⟺  K ≤ ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) / T_{r−1}^{1/3}.

So

    K ( Σ_{j=r}^{n} T_j^{1/3} C_j/T_j ) ≤ [ ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) / T_{r−1}^{1/3} ] ( Σ_{j=r}^{n} T_j^{1/3} C_j/T_j )
        = ( Σ_{j=r}^{n} (T_j/T_{r−1})^{1/3} C_j/T_j ) ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j )
        ≤ ( Σ_{j=r}^{n} C_j/T_j ) ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j )    (4.14)

since T_1 ≥ ⋯ ≥ T_n. Now, with K′ := K − Σ_{j=r}^{n} C_j/T_j,

    X̂_{r−1} / X^L_{r−1} = [ T_{r−1}^{1/3} K′ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j ] · [ Σ_{j=1}^{n} T_j^{1/3} C_j/T_j / ( T_{r−1}^{1/3} K ) ]
        = (K′/K) · [ Σ_{j=1}^{n} T_j^{1/3} C_j/T_j / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j ]
        = [ (K − Σ_{j=r}^{n} C_j/T_j)/K ] · [ Σ_{j=1}^{n} T_j^{1/3} C_j/T_j / ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j − Σ_{j=r}^{n} T_j^{1/3} C_j/T_j ) ]
        = [ K ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) − ( Σ_{j=r}^{n} C_j/T_j ) ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) ] / [ K ( Σ_{j=1}^{n} T_j^{1/3} C_j/T_j ) − K ( Σ_{j=r}^{n} T_j^{1/3} C_j/T_j ) ].

By (4.14), the numerator does not exceed the denominator, so X̂_{r−1} ≤ X^L_{r−1} ≤ 1, contradicting the fact that X̂_{r−1} > 1.
Proposition 4.7 leads to an algorithm for the determination of r−1 and thus of X̂. In general we may not be able to locate r−1 right after the first application of the Lagrange multiplier method (see Proposition 4.8), but we can repeat the process and will eventually arrive at the value of r−1. The following is an algorithm to compute the solution to problem (4.1).

The algorithm: Arrange T_1, …, T_n so that T_1 ≥ ⋯ ≥ T_n.

Step 1: By the method of Lagrange multipliers, i.e., from (4.8), set

    X^{(1)}_i := X^L_i = T_i^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j,    (4.15)

so that X^{(1)}_1 ≥ ⋯ ≥ X^{(1)}_n by Theorem 4.6. Notice that

    X^{(1)}_i = T_i^{1/3} K / Σ_{j=1}^{n} T_j^{1/3} C_j/T_j < T_i/C_i,  i = 1, …, n,    (4.16)

    X^{(1)}_1 = K / Σ_{j=1}^{n} (T_j/T_1)^{1/3} C_j/T_j ≥ K / Σ_{j=1}^{n} C_j/T_j ≥ 1.    (4.17)

Then consider the cases:

(a) if X^{(1)}_n ≥ 1, then X̂ = (X^{(1)}_1, …, X^{(1)}_n) by Theorem 4.6.

(b) if X^{(1)}_n < 1, i.e., there is 0 ≤ s_1 ≤ n−1 such that X^{(1)}_1 ≥ ⋯ ≥ X^{(1)}_{s_1} > 1 ≥ X^{(1)}_{s_1+1} ≥ ⋯ ≥ X^{(1)}_n, then by (4.13) r−1 ≤ s_1, where

    X̂_1 ≥ ⋯ ≥ X̂_{r−1} > 1 = X̂_r = X̂_{r+1} = ⋯ = X̂_n

and r−1 is fixed but still to be determined. In other words, we know that X̂_{s_1+1} = ⋯ = X̂_n = 1.

Remark: s_1 = 0 means that X^L_i ≤ 1 for all i = 1, …, n. But this actually means that X^L_i = 1 for all i, because g(X^L) = K.

Then the problem is reduced to the following: find X^{(2)}_1 ≥ ⋯ ≥ X^{(2)}_{s_1} ≥ 1 (= X^{(2)}_{s_1+1} = ⋯ = X^{(2)}_n), which yields

    min Σ_{i=1}^{s_1} C_i/X_i² + Σ_{i=s_1+1}^{n} C_i,  or simply  min Σ_{i=1}^{s_1} C_i/X_i²,

subject to the constraints

1. Σ_{i=1}^{s_1} X_i C_i/T_i = K^{(2)}, where

       K^{(2)} := K − Σ_{i=s_1+1}^{n} C_i/T_i,

   and obviously Σ_{i=1}^{s_1} C_i/T_i ≤ K^{(2)} (we may set K^{(1)} := K in Step 1).

2. 1 ≤ X_i ≤ T_i/C_i, i = 1, …, s_1.
Step 2: Apply the method of Lagrange multipliers to the reduced problem to yield

    X^{(2)}_i = T_i^{1/3} K^{(2)} / Σ_{j=1}^{s_1} T_j^{1/3} C_j/T_j,  i = 1, …, s_1,    (4.18)

so that X^{(2)}_1 ≥ ⋯ ≥ X^{(2)}_{s_1}. Then consider the cases:

(a) if X^{(2)}_{s_1} ≥ 1, then X̂ = (X^{(2)}_1, …, X^{(2)}_{s_1}, 1, …, 1).

(b) if X^{(2)}_{s_1} < 1, i.e., there is 0 ≤ s_2 ≤ s_1 − 1 such that X^{(2)}_1 ≥ ⋯ ≥ X^{(2)}_{s_2} > 1 ≥ X^{(2)}_{s_2+1} ≥ ⋯ ≥ X^{(2)}_{s_1}, then the problem is reduced once more: find X^{(3)}_1 ≥ ⋯ ≥ X^{(3)}_{s_2} ≥ 1 (= X^{(3)}_{s_2+1} = ⋯ = X^{(3)}_{s_1} = ⋯ = X^{(3)}_n) with s_2 < s_1, repeating the same construction with K^{(3)} := K − Σ_{i=s_2+1}^{n} C_i/T_i, and so on. Since s_1 > s_2 > ⋯ is strictly decreasing, the iteration terminates after at most n steps, when all of the computed factors are at least 1.

Proposition 4.8. Whenever a further iteration is needed, the factors that are still larger than 1 can only shrink: for each i with X^{(ℓ)}_i > 1 (ℓ ≥ 1), X^{(ℓ+1)}_i < X^{(ℓ)}_i. In other words, each needed iteration will decrease the X's which are larger than 1.
Example 4.9. For instance, take a look at the following experimental example. Suppose

    T := (T_1, T_2, T_3, T_4) = (25391, 14905, 12913, 5758)

and

    C := (C_1, C_2, C_3, C_4) = (4616, 6073, 575, 515).

Notice that T_1 ≥ T_2 ≥ T_3 ≥ T_4.

    ith iteration   X^{(i)}_1   X^{(i)}_2   X^{(i)}_3   X^{(i)}_4
    1               1.23        1.03        0.99        0.75
    2               1.19        0.99        1           1
    3               1.18        1           1           1

So three iterations are needed to get X̂ = (1.18, 1, 1, 1) and r−1 = 1.
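The iteration above can be carried out mechanically. The following MATLAB sketch (an illustration, not the thesis code; it assumes the tasks are already sorted so that T(1) ≥ ⋯ ≥ T(n)) should reproduce the table for the data of Example 4.9:

    T = [25391 14905 12913 5758];  C = [4616 6073 575 515];
    n = numel(T);
    K = n*(2^(1/n) - 1);            % RM bound, K^(1)
    X = ones(1, n);
    s = n;                          % indices 1..s are still free
    while true
        denom = sum(T(1:s).^(1/3) .* (C(1:s)./T(1:s)));
        X(1:s) = T(1:s).^(1/3) * K / denom;   % Lagrange step (4.15)/(4.18)
        if X(s) >= 1, break; end              % all free factors >= 1: done
        below = find(X(1:s) <= 1, 1);         % first factor that drops to 1 or below
        K = K - sum(C(below:s)./T(below:s));  % shrink the budget to K^(l+1)
        X(below:s) = 1;
        s = below - 1;
        if s == 0, break; end
    end
    % X comes out as approximately (1.18, 1, 1, 1)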
Remark 4.10. In case Theorem 4.6(2) occurs,

    Y := (X^L_1, …, X^L_{s_1}, 1, …, 1) ∉ R,

since Σ_{i=1}^{n} Y_i C_i/T_i > Σ_{i=1}^{n} X^L_i C_i/T_i = K and because of (4.4). So Y is not a solution to the minimization problem (4.1). Thus there is no contradiction even though f(Y) < f(X^L) (≤ min_{X∈R} f(X)). A similar conclusion can be reached for the consecutive iterations with respect to Proposition 4.8.
The following is another description, from a top-down view; a similar algorithm can be derived from it.

Theorem 4.11. Let T_1 ≥ ⋯ ≥ T_n, and suppose that the minimization problem (4.5) has solution X̂ = (X̂_1, …, X̂_n) ∈ R.

1. X^L_n ≥ 1 if and only if X̂_n ≥ 1. In this case, X̂ = X^L is the unique solution to the minimization problem (4.1).

2. If X^L_n < 1, then the unique solution X̂_1 ≥ ⋯ ≥ X̂_{r−1} > X̂_r = ⋯ = X̂_n = 1 has r where r−1 is the first integer i such that

       T_i^{1/3} K^{(i)} / Σ_{j=1}^{i} T_j^{1/3} C_j/T_j > 1  and  T_{i+1}^{1/3} K^{(i+1)} / Σ_{j=1}^{i+1} T_j^{1/3} C_j/T_j ≤ 1,

   where K^{(k)} := K − Σ_{j=k+1}^{n} C_j/T_j.
To illustrate, let us consider the special case n = 2. We first arrange the T's so that T_1 ≥ T_2. The Lagrange multiplier method yields X^L_i = T_i^{1/3} K / Σ_{j=1}^{2} T_j^{1/3} C_j/T_j, i = 1, 2. Notice that

    X^L_1 = T_1^{1/3} K / ( T_1^{1/3} C_1/T_1 + T_2^{1/3} C_2/T_2 ) = K / ( C_1/T_1 + (T_2/T_1)^{1/3} C_2/T_2 ) ≤ K T_1/C_1 ≤ T_1/C_1,    (4.19)

    X^L_2 = T_2^{1/3} K / ( T_1^{1/3} C_1/T_1 + T_2^{1/3} C_2/T_2 ) = K / ( (T_1/T_2)^{1/3} C_1/T_1 + C_2/T_2 ) ≤ K T_2/C_2 ≤ T_2/C_2,    (4.20)

    X^L_1 = K / ( C_1/T_1 + (T_2/T_1)^{1/3} C_2/T_2 ) ≥ K / ( C_1/T_1 + C_2/T_2 ) ≥ 1    (4.21)

by K ≥ C_1/T_1 + C_2/T_2. Moreover,

    X^L_1 ≥ X^L_2    (4.22)

because T_1 ≥ T_2. So it remains to check whether X^L_2 ≥ 1.
We take another approach and treat f as a function of X_2. Since C_1 X_1/T_1 + C_2 X_2/T_2 = K, we have X_1 = (K − C_2 X_2/T_2) T_1/C_1, so that

    f(X) = f(X_2) = C_1 / [ (K − C_2 X_2/T_2) T_1/C_1 ]² + C_2 / X_2².

There are two singularities on ℝ, namely X_2 = 0, i.e., (X_1, X_2) = (K T_1/C_1, 0), and X_2 = K T_2/C_2, i.e., (X_1, X_2) = (0, K T_2/C_2). Notice that [1, (K − C_1/T_1) T_2/C_2] is the range for X_2, according to the restrictions.

Consider f(X_2):

1. On the open interval (−∞, 0), f decreases from ∞ (as X_2 → 0⁻) to 0 (as X_2 → −∞).

2. On the open interval (0, K T_2/C_2), f has a minimum at

       X^L_2 = K / ( (T_1/T_2)^{1/3} C_1/T_1 + C_2/T_2 ) ≤ K T_2/C_2 ≤ T_2/C_2,

   i.e., at

       (X^L_1, X^L_2) = ( K / ( C_1/T_1 + (T_2/T_1)^{1/3} C_2/T_2 ),  K / ( (T_1/T_2)^{1/3} C_1/T_1 + C_2/T_2 ) ),

   given by the method of Lagrange multipliers.

3. On the open interval (K T_2/C_2, ∞), f decreases from ∞ to 0.

Notice that the interval [1, (K − C_1/T_1) T_2/C_2] is a subset of the open interval (0, K T_2/C_2). Recall that 1 ≤ X^L_1 ≤ T_1/C_1, X^L_2 ≤ T_2/C_2, C_1 X_1/T_1 + C_2 X_2/T_2 = K, and X^L_1 ≥ X^L_2. So we only have the following two cases:

Case 1: X^L_2 ≥ 1. Then (X^L_1, X^L_2) provides the minimum.

Note: Since C_1 X^L_1/T_1 + C_2 X^L_2/T_2 = K, X^L_1 ≥ 1 implies C_2 X^L_2/T_2 = K − C_1 X^L_1/T_1 ≤ K − C_1/T_1, so that

    X^L_2 ≤ (K − C_1/T_1) T_2/C_2.

Similarly, if X^L_2 ≥ 1, then

    X^L_1 ≤ (K − C_2/T_2) T_1/C_1.

Case 2: X^L_2 < 1. The minimum must then be attained on the boundary ∂R. Now

    d f(X_2)/dX_2 = 2 C_1 (C_2/T_2) / [ (K − C_2 X_2/T_2)³ (T_1/C_1)² ] − 2 C_2 / X_2³
                  = ( 2 C_2 / [ (K − C_2 X_2/T_2)³ X_2³ ] ) [ C_1³ X_2³ / (T_1² T_2) − (K − C_2 X_2/T_2)³ ].

Recall that the range for X_2 is [1, (K − C_1/T_1) T_2/C_2]. For 1 ≤ X_2 < T_2/C_2, since X^L_2 < 1, i.e.,

    (C_1/T_1)(T_1/T_2)^{1/3} + C_2/T_2 > K,

we have

    (C_1/T_1)(T_1/T_2)^{1/3} + C_2 X_2/T_2 ≥ (C_1/T_1)(T_1/T_2)^{1/3} + C_2/T_2 > K.

Hence C_1³/(T_1² T_2) > (K − C_2 X_2/T_2)³, and since X_2³ ≥ 1,

    C_1³ X_2³/(T_1² T_2) − (K − C_2 X_2/T_2)³ > 0,

so that d f(X_2)/dX_2 > 0. In other words, f is increasing on the interval, and hence the minimum occurs at X_2 = 1.
The following is a typical picture for the n = 2 case.

Figure 4.1: Plot of f(X_2) for C_1 = 9, T_1 = 19, C_2 = 8, T_2 = 28; here (K − C_1/T_1) T_2/C_2 = 1.2416.
4.2 Scaling Factor Assignment Algorithm
On the basis of the above proofs, we restate our solution as follows:

1. Sort the time periods T_i in descending order.

2. Compute the scaling factors X_i by the method of Lagrange multipliers (X_1 ≥ ⋯ ≥ X_n). If all of them are greater than or equal to 1, they are the solution and we are done. Otherwise, some of the X's are less than 1.

3. Find the X_i that are less than or equal to 1 and set them to 1, i.e., X_r = 1 for r = i, i+1, …, n.

4. Repeat steps 2 and 3 until all X_i ≥ 1.
Table 4.1 is the ScaleFactors pseudocode according to the above algorithm.

 1  Procedure ScaleFactors
 2  Input: task set t = {} of n elements
 3  Output: scale factor set {X_i} of n elements
 4  Sort t in descending T_i
 5  K ← n(2^{1/n} − 1)
 6  r ← n + 1
 7  repeat
 8      denom ← 0, K′ ← K
 9      for j ← 1 to r−1 do denom ← denom + T_j^{1/3} · C_j/T_j
10      for q ← r to n do K′ ← K′ − C_q/T_q
11      trunc ← false
12      for i ← 1 to r−1 do
13          X_i ← T_i^{1/3} · K′/denom
14          if (X_i ≤ 1) then
15              r ← i, trunc ← true
16              break
17          end if
18      end for
19  until (not trunc) or (r = 1)
20  for i ← r to n do X_i ← 1
21  end ScaleFactors

Table 4.1: Procedure ScaleFactors Pseudocode
Let us analyze the ScaleFactors algorithm.

• Sorting the task set t of n elements takes O(n lg n) time if we use the Quicksort algorithm.

• Initializing the variables K and r at the beginning takes O(1) time each.

• There are three loops nested inside the outer repeat loop. The bodies of the first and second inner loops, controlled by the counters j and q, are together executed n times per outer iteration, for j = 1, …, r−1 and q = r, …, n; that is, O(n) time.

• Similarly, the body of the third inner loop, controlled by the counter i, is executed at most r−1 times, depending on the current value of the variable r. Since r is initialized to n+1 and never exceeds n+1, this loop body executes at most n times per outer iteration, so the third inner loop also takes O(n) time.

• The outer loop terminates when trunc stays false or when r reaches 1. In the best case, trunc is never assigned true (or r = 1 immediately), and only one outer iteration is needed. In the worst case, r decreases by only one position per outer iteration, giving O(n) outer iterations.

In sum, the running time of the ScaleFactors algorithm is O(n lg n) + O(1) + O(n²) = O(n²).
However, since our goal is to find (X̂_1, X̂_2, …, X̂_{r−1}) for some r = 1, …, n, if we can locate the first X less than 1 sooner in each iteration, we can improve the complexity of the algorithm. Using the divide-and-conquer paradigm of Table 4.2 to arrive at X̂_{r−1} takes O(n lg n) run-time, so the overall complexity is reduced to O(n lg n).
 1  Procedure ScaleFactorsDnC
 2  Input: task set t = {} of n elements
 3  Output: scale factor set {X_i} of n elements
 4  Sort t in descending T_i
 5  K ← n(2^{1/n} − 1)
 6  first ← 1, last ← n
 7  while (first ≤ last) do
 8      mid ← ⌊(first + last)/2⌋
 9      r ← mid + 1
10      denom ← 0, K′ ← K
11      for j ← 1 to r−1 do denom ← denom + T_j^{1/3} · C_j/T_j
12      for q ← r to n do K′ ← K′ − C_q/T_q
13      X_mid ← T_mid^{1/3} · K′/denom
14      if (X_mid > 1) then
15          first ← mid + 1
16      else
17          last ← mid − 1
18      end if
19  end while
20  r ← first
21  denom ← Σ_{j=1}^{r−1} T_j^{1/3} · C_j/T_j, K′ ← K − Σ_{q=r}^{n} C_q/T_q
22  for i ← 1 to r−1 do X_i ← T_i^{1/3} · K′/denom
23  for i ← r to n do X_i ← 1
24  end ScaleFactorsDnC

Table 4.2: Procedure ScaleFactorsDnC Pseudocode
4.3 Revised Scaling Factor Assignment Algorithm
If we look back at the mathematical form of our solution in (4.11), we find that X̂_i > 1 only when i ≤ r−1. In other words, X̂_i = 1 when i > r−1, according to Proposition 4.7. Therefore, we can also obtain X̂_{r−1} more efficiently if we compute the X's backwards. We adjust the above algorithm and list its pseudocode in Table 4.3.
 1  Procedure ScaleFactorsBackward
 2  Input: task set t = {} of n elements
 3  Output: scale factor set {X_i} of n elements
 4  Sort time periods in descending order
 5  K ← n(2^{1/n} − 1), denom ← 0
 6  for i ← 1 to n do
 7      denom ← denom + T_i^{1/3} · C_i/T_i
 8  end for
 9  for i ← n down to 1 do
10      X_i ← T_i^{1/3} · K/denom
11      if X_i ≤ 1 then
12          X_i ← 1
13          K ← K − C_i/T_i
14          denom ← denom − T_i^{1/3} · C_i/T_i
15          r ← i
16      end if
17  end for
18  end ScaleFactorsBackward

Table 4.3: Procedure ScaleFactorsBackward Pseudocode
• As before, sorting the task set t of n elements takes O(n lg n) time if we use the Quicksort algorithm.

• Initializing the variables K and denom takes O(1) time.

• The body of the first loop, controlled by the counter i, is executed n times, for i = 1, …, n; that is, O(n) time.

• Similarly, the body of the second loop, also controlled by the counter i, is executed n times, for i = n, …, 1, so its statements also take O(n) time.

ScaleFactorsBackward therefore takes O(n lg n) + O(1) + O(n) + O(n) = O(n lg n) time.

If we only consider the complexity of computing the X̂_i, ScaleFactorsBackward takes only O(n) time to arrive at X̂_r, whereas ScaleFactors takes O(n²) and ScaleFactorsDnC takes O(n lg n) time. Therefore, the ScaleFactorsBackward algorithm has lower complexity than the other two.
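For reference, a direct MATLAB transcription of Table 4.3 might look as follows. This is only a sketch (it is not the program listed in the Appendix) and it assumes that T and C are row vectors already sorted with T(1) ≥ ⋯ ≥ T(n):

    function X = scale_factors_backward(T, C)
    % Sketch of Procedure ScaleFactorsBackward (Table 4.3).
    n = numel(T);
    K = n*(2^(1/n) - 1);                 % upper bound of the RM test
    denom = sum(T.^(1/3) .* (C./T));     % sum_j T_j^(1/3) C_j/T_j
    X = zeros(1, n);
    for i = n:-1:1                       % walk backwards over the sorted tasks
        X(i) = T(i)^(1/3) * K / denom;
        if X(i) <= 1                     % clamp and remove task i from the pool
            X(i) = 1;
            K = K - C(i)/T(i);
            denom = denom - T(i)^(1/3) * C(i)/T(i);
        end
    end
    end

On the data of Example 4.9 this single backward pass should return approximately (1.18, 1, 1, 1), matching the iterative computation.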
Chapter 5
Result and Analysis
Consider the following task set:
Task Computation Time, C Time Period, T Utilization, U Priority
a 3 8 0.38 3
b 3 10 0.30 2
c 1 14 0.07 1
Table 5.1: Example Task Set A
Table 5.1 contains three tasks that are assigned priorities by the RM algorithm. Computation time and time period are measured in the number of clock cycles per second. Note that priority is given as an integer: the higher the integer, the higher the priority.

Task a is given the highest priority since it has the shortest time period. The combined CPU utilization from (3.1) is 0.75, which passes the RM test, whose upper bound equals 0.78 for n = 3.
Figure 5.1 is the time-line for Task Set A before scaling. The y-coordinate represents the clock frequency of the processor, and the x-coordinate represents task execution time. a_2 marks the deadline of task a; likewise, b_2 and c_2 are the deadlines of tasks b and c. Notice that all three tasks finish their execution before their deadlines.
Figure 5.1: Time-line for Task Set A before scaling, where computation times are
specified at the maximum CPU frequency
After scaling each task's computation time by its corresponding scaling factor from our solution, Tables 5.2 and 5.3 show the resulting energy reduction.
Task Computation Time, C Utilization, U CPU Frequency Energy
a 3 0.38 1.0
b 3 0.30 1.0 7
c 1 0.07 1.0
Table 5.2: Energy Consumption before Scaling for Task Set A
We suppose the maximum CPU frequency is 1.0. The energy values in Tables 5.2 and 5.3 are calculated from (4.1) using the corresponding CPU frequencies.
Task Computation Time, C Utilization, U CPU Frequency, f Minimum Energy
a 3 0.38 1
b 3.21 0.32 0.94 6.35
c 1.19 0.08 0.84
Table 5.3: Energy Consumption after Scaling for Task Set A
As we see in Table 5.3, the CPU frequency and computation time of each task change individually. For instance, the CPU frequency and computation time of task a remain the same after scaling, whereas for task b the CPU frequency is slightly reduced to 0.94 and its computation time is extended from 3 to 3.21. The CPU frequency of task c is reduced even more. This illustrates the algorithm of Chapter 4: the longer the time period of a task, the greater its scaling factor.

The combined CPU utilization after scaling just reaches 0.78 (the upper bound of the RM test), and the scaled task set saves 17.79% energy.
The scaled time-line is depicted in Figure 5.2. Note that not only are the CPU frequencies of tasks b and c reduced, but the slack time is also shortened. Each task nevertheless still completes its execution before its deadline, which confirms that the scaled task set can still be scheduled by the RM algorithm.
Figure 5.2: Scaled Time-line for Task Set A
Consider another case:
Task Computation Time, C Time Period, T Utilization, U Priority
a 2 14 0.14 1
b 1 10 0.10 3
c 3 12 0.25 2
Table 5.4: Example Task Set B
Task b in Table 5.4 is given the highest priority and task a the lowest. Their combined CPU utilization is 0.49, much less than 0.78.

The time-line scheduled by the RM algorithm for Task Set B is as follows:
Figure 5.3: Time-line for Task Set B
The slack time in task set B is much longer than in task set A, since its combined
CPU utilization is much less than in task set A. Task b has the greatest priority and
executes first. Again, we assume that every task executes at the maximum CPU
frequency 1.0.
Tables 5.5 and 5.6 show the result of power reduction after scaling. The scaled task set saves almost 60.22% energy.
Task Computation Time, C Utilization, U CPU Frequency, f Energy
a 2 0.14 1.0
b 1 0.10 1.0 6
c 3 0.25 1.0
Table 5.5: Energy Consumption before Scaling for Task Set B
Task Computation Time, C Utilization, U CPU Frequency, f Minimum Energy
a 3.32 0.24 0.60
b 1.58 0.16 0.67 2.39
c 4.45 0.37 0.63
Table 5.6: Energy Consumption after Scaling for Task Set B
Figure 5.4: Scaled Time-line for Task Set B
The time-line for task set B after CPU frequency scaling is in Figure 5.4.
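As a rough cross-check of Table 5.6, the scaling factors and the total energy for Task Set B can be recomputed from the closed form (4.8); the MATLAB sketch below is an illustration (not part of the thesis code) using the periods and computation times of Table 5.4:

    T = [14 12 10];  C = [2 3 1];             % Task Set B sorted by descending period (a, c, b)
    n = numel(T);
    K = n*(2^(1/n) - 1);                      % RM bound, about 0.7798
    X = T.^(1/3) * K / sum(T.^(1/3).*(C./T)); % closed-form factors (4.8); all come out > 1 here
    E_before = sum(C);                        % energy at full frequency, 6
    E_after  = sum(C ./ X.^2);                % scaled energy, about 2.39
    saving   = (E_before - E_after)/E_before  % about 0.60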
The results show that the percentage of power saving varies dramatically from one task set to another. In general, the larger the difference between the combined CPU utilization and the upper bound of the RM test, the greater the power reduction.

We should also consider the effect of voltage-switching overhead, since during a voltage switch the CPU cannot execute any task [15]. We could further combine our approach with different dynamic reclaiming algorithms, and we plan to extend this work to real-life examples as future work.
Chapter 6
Summary
In this thesis, we derive a minimum-energy formulation for RM task sets and solve it. Unlike other approaches, our solution works directly with the RM test combined with the power dissipation equation for CMOS circuits. By using the method of Lagrange multipliers, we are able to obtain a scaling factor for each task's execution time.
We assume the processor can vary its supply voltage dynamically, and we ignore the threshold voltage and the voltage-switching overhead. We also assume that every slack time can be fully filled by extending each task's WCET according to its scaling factor.
For a task set containing n tasks, the scaling factors X̂_i can be obtained by iteratively applying the following formula and testing whether each scaling factor X̂_i is less than or equal to 1:

    X̂_i = T_i^{1/3} K′ / Σ_{j=1}^{r−1} T_j^{1/3} C_j/T_j,  i = 1, …, r−1, for some r = 1, …, n,

    K′ := n(2^{1/n} − 1) − Σ_{i=r}^{n} C_i/T_i.
The experimental results show that the minimum power consumption is attained exactly when all X̂_i are greater than or equal to 1. At the same time, the RM task set is still guaranteed to meet all deadlines; our solution is proven not to violate task deadlines.
Bibliography
[1] A. Burns and A. Wellings, Real Time Systems and Programming Languages: Ada
95, Real-Time Java and Real-Time C/POSIX, 3rd ed., Addison Wesley, 2001.
[2] A. P. Chandrakasan, S. Sheng and R. W. Brodersen, "Low-Power CMOS Digital Design," IEEE Journal of Solid-State Circuits, vol. 35, pp. 473-484, Apr. 1992.

[3] A. Chilambuchelvan, S. Saravanan and J. R. P. Perinbam, "A Simulation Software Development for Performance Analysis of DVS Algorithm for Low Power Embedded System," ARPN Journal of Engineering and Applied Sciences, vol. 2, pp. 27-31, Aug. 2007.

[4] F. Gruian, "Hard Real-Time Scheduling for Low-Energy Using Stochastic Data and DVS Processors," In Proc. the 2001 International Symposium on Low Power Electronics and Design, 2001, pp. 46-51.

[5] H. Aydin, R. Melhem, D. Mossé and P. Mejía-Alvarez, "Dynamic and Aggressive Scheduling Techniques for Power-Aware Real-Time Systems," Real-Time Systems, pp. 225-232, Jun. 2001.
[6] R.A. Horn and C.R. Johnson, Matrix Analysis, Cambridge University Press,
1985.
[7] W. Kim, D. Shin, H. S. Yun, J. Kim, and S. L. Min, "Performance Comparison of Dynamic Voltage Scaling Algorithms for Hard Real-Time Systems," In Proc. the 8th IEEE Real-Time and Embedded Technology and Applications Symposium, 2002, pp. 219-228.
[8] Y. H. Tsai, K. Wang, and J. M. Chen, "A Deferred-Workload-based Inter-Task Dynamic Voltage Scaling Algorithm for Portable Multimedia Devices," In Proc. the 2007 International Conference on Wireless Communications and Mobile Computing, 2007, pp. 677-682.

[9] C. L. Liu and J. W. Layland, "Scheduling Algorithms for Multiprogramming in a Hard Real-Time Environment," Journal of the ACM (JACM), vol. 20, pp. 46-61, Jan. 1973.

[10] Y. Liu and A. K. Mok, "An Integrated Approach for Applying Dynamic Voltage Scaling to Hard Real-Time Systems," In Proc. the 9th IEEE Real-Time and Embedded Technology and Applications Symposium, 2003, pp. 116-123.

[11] A. Manzak and C. Chakrabarti, "Variable Voltage Task Scheduling Algorithms for Minimizing Energy/Power," In Proc. the 2001 International Symposium on Low Power Electronics and Design, 2001, pp. 279-282.

[12] P. Pillai and K. G. Shin, "Real-Time Dynamic Voltage Scaling for Low-Power Embedded Operating System," In Proc. the 18th ACM Symposium on Operating Systems Principles, 2001, pp. 89-102.

[13] D. Shin, J. Kim and S. Lee, "Intra-Task Voltage Scheduling for Low-Energy Hard Real-Time Applications," IEEE Design and Test of Computers, vol. 18, pp. 20-30, Mar. 2001.

[14] Y. Shin and K. Choi, "Power Conscious Fixed Priority Scheduling for Hard Real-Time Systems," In Proc. the 36th ACM/IEEE Conference on Design Automation, 1999, pp. 134-139.

[15] V. Swaminathan and K. Chakrabarty, "Investigating the Effect of Voltage-Switching on Low-Energy Task Scheduling in Hard Real-Time Systems," In Proc. the 2001 Conference on Asia South Pacific Design Automation, 2001, pp. 251-254.

[16] O. S. Unsal and I. Koren, "System-Level Power-Aware Design Techniques in Real-Time Systems," Proceedings of the IEEE, 2003, pp. 1055-1069.

[17] M. Weiser, B. Welch, A. Demers, and S. Shenker, "Scheduling for Reduced CPU Energy," In Proc. the 1st USENIX Conference on Operating Systems Design and Implementation, 1994, pp. 13-23.

[18] X. L. Zhong and C. Z. Xu, "Energy-Aware Modeling and Scheduling of Real-Time Tasks for Dynamic Voltage Scaling," In Proc. the 26th IEEE International Real-Time Systems Symposium, 2005.
Appendices
function y = MinEnergy(T,C)
...
[T,NDX] = sort(T,'descend');        % sort T in descending order
C = C(NDX);                         % sort C accordingly (not necessarily descending)
unsort(NDX) = 1:n;                  % inverse sort; see http://blogs.mathworks.com/loren/?p=104
K = n*(2^(1/n)-1);                  % upper bound of RM
C1 = C;
T1 = T;
i = 1;
while i <= n
    D = sum((T1.^(1/3)).*(C1./T1)); % denominator
    X(i) = (T1(i))^(1/3)*K/D;
    if X(i) <= 1
        X(i) = 1;
        K = K - sum((C1(i:n))./(T1(i:n)));
        T1 = T1(1:i-1); C1 = C1(1:i-1);
        n = i-1; i = 1;
    else
        i = i+1;
    end
end
unsort;
X = X(unsort)                       % print the real X
end