Design and Implementation of Scalable and Efficient Programming Models for Fast Computation and Data Processing

Que, Xinyu

Metadata Field	Value	Language
dc.contributor.advisor	Yu, Weikuan
dc.contributor.author	Que, Xinyu
dc.date.accessioned	2013-05-14T19:06:09Z
dc.date.available	2013-05-14T19:06:09Z
dc.date.issued	2013-05-14
dc.identifier.uri	http://hdl.handle.net/10415/3645
dc.description.abstract	Despite the tremendous growth of computational power, scientific applications and business data analytics continue to face many challenges, such as programming productivity, application scalability, and efficiency. Recently, Global Address Space (GAS) and Partitioned Global Address Space (PGAS) programming models have been emerging because of their ability to alleviate the programming burden by supporting data access to both local and remote memory through a simple shared-memory addressing model. Meanwhile, with the exponential growth of the digital universe, the MapReduce programming model has become popular for data analytics because of its ease of use, low cost on commodity hardware, fault tolerance, and programming flexibility. Furthermore, as social media data grows, the relationships within it become more complex and are typically modeled as massive graphs, which require scalable algorithms to analyze. This dissertation investigates the research challenges in these directions and contributes efficient and scalable programming models for fast computation and data processing. It first addresses the critical challenges faced by the underlying runtime systems of the GAS model on petascale systems. In particular, I have proposed and designed Hierarchical Cooperation (HiCOO) to support scalable communication for GAS programming models; it realizes scalable resource management and achieves resilience to network contention while maintaining or enhancing the performance of scientific applications. The second study addresses the performance challenges in the existing MapReduce programming model. I have revealed a number of issues faced by the current MapReduce programming model and proposed a novel virtual shuffling strategy to enable efficient data movement in the MapReduce data shuffling phase, which significantly reduces disk I/O accesses, resulting in improved performance and reduced power consumption. The third study is on large-scale graph processing. I have designed and implemented a parallel community detection algorithm for distributed-memory systems. It can perform community analysis in real time for massive graphs.	en_US
dc.rights	EMBARGO_NOT_AUBURN	en_US
dc.subject	Computer Science	en_US
dc.title	Design and Implementation of Scalable and Efficient Programming Models for Fast Computation and Data Processing	en_US
dc.type	dissertation	en_US
dc.embargo.length	NO_RESTRICTION	en_US
dc.embargo.status	NOT_EMBARGOED	en_US

Files in this item

Name: ausample.pdf
Size: 3.519Mb
