Energy Usage Profiling and Topology-Based Scheduling for Clusters
Type of Degree: PhD Dissertation
Computer Science and Software Engineering
Energy saving has become one of the most prominent topics in technology in recent decades. Advances in technology have brought a sharp increase in data volume and in the scale of clusters and data centers, and with them another essential issue: energy cost. In the first part of this dissertation, we investigate this issue by evaluating energy efficiency with the TPC-W benchmark, a well-known web transaction e-commerce benchmark. We simulate web transactions with different database sizes and collect energy data with a Kill A Watt power meter. We deploy this setup on four cluster configurations: two homogeneous clusters (PC nodes only and wimpy nodes only) and two heterogeneous clusters (a PC front-end server with a wimpy database server, and a wimpy web server with a PC database server). The energy results show different characteristics across these configurations, which can offer useful guidance for future work on data centers. In the second part of this dissertation, we propose a novel scheduler for Apache Storm, the topology-based scheduler (TOSS for short). Processing massive amounts of data has become a profound challenge, and a handful of technologies have emerged as promising platforms for data-intensive processing. Apache Storm is an open-source platform for large-scale streaming computation that is widely used in industry (e.g., at Twitter). Performance bottlenecks encountered in streaming data applications motivate us to investigate scheduling issues in Storm. A key aspect of tuning Storm performance is deciding how to deploy the components of a Storm application across the available nodes in a cluster. Driven by our observations, we design and implement a new scheduling strategy, TOSS, based on application structure.
Compared to the existing round-robin scheduler, TOSS not only judiciously handles tightly bound components, but also balances workloads by introducing a self-tuning mechanism in the deployment stage. We conduct experiments with two popular and distinct topologies to evaluate the performance of TOSS. The experimental results suggest that TOSS significantly outperforms the round-robin scheduler. In particular, TOSS substantially improves the system throughput of Storm while shortening the latency of Storm applications.
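The difference between round-robin placement and placement that takes the application structure into account can be illustrated with a minimal sketch. This is not the TOSS algorithm, whose details are not given in this abstract, and it does not use the Storm scheduler API. The function names, the traffic weights, and the per-node capacity below are all hypothetical, chosen only to show why co-locating heavily communicating components can reduce inter-node traffic.

```python
from itertools import cycle


def round_robin(components, nodes):
    """Assign components to nodes cyclically, ignoring the topology."""
    return {c: n for c, n in zip(components, cycle(nodes))}


def topology_aware(components, edges, nodes, capacity):
    """Greedy heuristic (illustrative only, not TOSS): visit edges from
    heaviest to lightest and co-locate their endpoints while the node
    still has free capacity."""
    if len(nodes) * capacity < len(components):
        raise ValueError("not enough capacity for all components")
    placement = {}
    load = {n: 0 for n in nodes}

    def place(comp, preferred=None):
        if comp in placement:
            return
        if preferred is not None and load[preferred] < capacity:
            target = preferred
        else:
            # Fall back to the least-loaded node (ties go to the first node).
            target = min(nodes, key=lambda n: load[n])
        placement[comp] = target
        load[target] += 1

    for (a, b), _weight in sorted(edges.items(), key=lambda kv: -kv[1]):
        if a in placement and b not in placement:
            place(b, placement[a])
        elif b in placement and a not in placement:
            place(a, placement[b])
        else:
            place(a)
            place(b, placement[a])
    for c in components:  # components that appear in no edge
        place(c)
    return placement


def inter_node_traffic(placement, edges):
    """Sum the traffic of edges whose endpoints sit on different nodes."""
    return sum(w for (a, b), w in edges.items() if placement[a] != placement[b])


# Hypothetical four-component pipeline with made-up traffic weights.
components = ["spout", "split", "count", "report"]
edges = {("spout", "split"): 100, ("split", "count"): 80, ("count", "report"): 5}
nodes = ["n1", "n2"]

rr = round_robin(components, nodes)
ta = topology_aware(components, edges, nodes, capacity=2)
print(inter_node_traffic(rr, edges))  # 185
print(inter_node_traffic(ta, edges))  # 80
```

In this toy case round-robin splits every communicating pair across the two nodes, while the greedy heuristic keeps the heaviest pair together and only the middle edge crosses the network. A real scheduler would also need to account for CPU and memory load, which this sketch reduces to a simple slot count.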