This Is Auburn: Electronic Theses and Dissertations

Browsing by Author "Yu, Weikuan"

Now showing items 1-16 of 16

Accelerate MapReduce’s Failure Recovery through Timely Detection and Work-conserving Logging 

Fu, Huansong (2015-09-08)
MapReduce has become an indispensable part of the increasing market for big data analytics. As the representative implementation of MapReduce, Hadoop/YARN strives to provide outstanding performance in terms of job turnaround ...

Analyzing the Benefits of Graphics Processing Units for Computation-Intensive Applications on Hadoop 

Vasko, Kevin (2015-07-21)
Due to the ever expanding amount of data that is being generated in the “Big Data era” there is an ever increasing challenge of processing this data. This work aims to tackle the challenge of improving the performance of ...

Analyzing the Effects of Sequencer Discrepancies on Next-Generation Genome Assembly Tools

Pritchard, Michael Jr (2016-08-04)
The advent of Next-Generation Sequencing (NGS) techniques in the early 21st century massively increased genetic sequencing throughput while dramatically reducing associated costs. This in turn lowered barriers of entry ...

Assessment of Multiple Ingest Strategies for Accumulo Key-Value Store 

Pham, Hai (2016-05-05)
In recent years, the emergence of heterogeneous data, especially of the unstructured type, has been extremely rapid. The data growth happens concurrently in 3 dimensions: volume (size), velocity (growth rate) and variety ...

Assessment of Multiple MapReduce Strategies for Fast Analytics of Small Files 

Zhou, Fang (2015-05-05)
Hadoop, an open-source implementation of MapReduce, is used widely because of its ease of programming, scalability, and availability. Hadoop distributed file system (HDFS) and Hadoop MapReduce are two important components ...

Design and Implementation of Scalable and Efficient Programming Models for Fast Computation and Data Processing 

Que, Xinyu (2013-05-14)
Despite the tremendous growth of computational power, scientific applications and business data analytics continue to face many challenges such as programming productivity, application scalability, and efficiency. ...

Design of a Hybrid Memory System for General-Purpose Graphics Processing Units 

Carpenter, Patrick (2012-04-27)
Addressing a limited power budget is a prerequisite for maintaining the growth of computer system performance into and beyond the exascale. Two technologies with the potential to help solve this problem include general-purpose ...

Efficient Movement and Task Management in MapReduce for Fast Analytics of Big Data 

Wang, Yandong (2014-05-12)
MapReduce programming model has achieved great success over the past decade. With its recognized merits such as superior scalability and strong fault tolerance, MapReduce has thrived as a primary processing engine adopted ...

Efficient Storage Design and Query Scheduling for Improving Big Data Retrieval and Analytics 

Liu, Zhuo (2015-05-04)
With the perpetually increasing requirement and generation of digital data, the human being has been stepping into the Big Data era. To efficiently manage, retrieve and exploit such gigantic amount of data continuously ...

Feature Enhancement and Performance Evaluation of BioPig Analytics 

Shi, Lizhen (2016-01-27)
Next-Generation sequencing produces huge collections of strings to be analyzed. This massive dataset challenges traditional analytics tools and increasingly requires novel solutions adapting to big data platforms. MapReduce ...

HadioFS: Improve the Performance of HDFS by Off-loading I/O to ADIOS 

Li, Xiaobing (2013-08-21)
Hadoop Distributed File System (HDFS) is the underlying storage for the whole Hadoop stack, which includes MapReduce, HBase, Hive, Pig, etc. Because of its robustness and portability, HDFS has been widely adopted, often ...

Mitigating GPU Memory Divergence for Data-Intensive Applications 

Wang, Bin (2015-07-21)
Graphics Processing Units (GPUs) have proven as a viable technology for a wide variety of general purpose applications to exploit the massive computing capability and high computation efficiency. In GPUs, threads are ...

Scalable Collective Communication and Data Transfer for High-Performance Computation and Data Analytics 

Xu, Cong (2015-04-27)
Large-scale computer clusters are leveraged to solve various crucial problems in both High Performance Computing and Cloud Computing fields. Message Passing Interface (MPI) and MapReduce are two prevalent tools to tap the ...

Secondary Bus Performance in Reducing Cache Writeback Latency 

Venkatesh, Rakshith (2011-04-08)
For single as well as multi core designs, effective strategies to minimize cache access latencies have been proposed by a number of researchers over the last decade. Such designs include the Miss Status Holding Registers, ...

Taming the Scientific Big Data with Flexible Organizations for Exascale Computing 

Tian, Yuan (2012-07-31)
The last five years of supercomputers has evolved at an unprecedented rate as High Performance Computing (HPC) continues to progress towards exascale computing in 2018. These systems enable scientists to simulate scientific ...

TCP/IP Implementation of Hadoop Acceleration 

Xu, Cong (2012-05-18)
Cloud Computing is a booming technology in computer science. Since Google released the design details of the MapReduce technique in 2004 [1], cloud computing has been more and more popular. Hadoop [2] has been developed ...