TCP/IP Implementation of Hadoop Acceleration

Xu, Cong

Metadata Field	Value	Language
dc.contributor.advisor	Yu, Weikuan
dc.contributor.author	Xu, Cong
dc.date.accessioned	2012-05-18T13:44:02Z
dc.date.available	2012-05-18T13:44:02Z
dc.date.issued	2012-05-18
dc.identifier.uri	http://hdl.handle.net/10415/3167
dc.description.abstract	Cloud computing is a rapidly growing area of computer science. Since Google released the design details of the MapReduce technique in 2004 [1], cloud computing has become increasingly popular. Hadoop [2] has been developed as an open-source implementation of MapReduce. A new network-levitated merge mechanism (Hadoop-A) [3] improves the existing Hadoop framework and solves many problems in the original design. Hadoop-A avoids repetitive merging of data and introduces a full pipeline that consists of shuffle, merge and reduce phases. However, Hadoop-A is implemented on InfiniBand RDMA technology, which is not commonly deployed on commercial servers. Data transmission based on the TCP/IP protocol, on the other hand, is a robust technology, and its speed keeps increasing. Thus, we deem it worthwhile to complement our RDMA-based connection with an implementation built on the TCP/IP protocol. In this thesis, I describe the design and implementation of a TCP/IP version of Hadoop-A. Two components, MOFSupplier (server) and NetMerger (client), are introduced to realize the TCP/IP connection; they fetch data from MapTasks and send it to ReduceTasks within the new network-levitated merge mechanism. Multithreading is used to manage the memory pool, to send and receive data, and to merge data segments. The experimental results show that the TCP/IP implementation of Hadoop-A performs well: its execution time improves on that of the original Hadoop by 26.7%, and it also scales well.	en_US
dc.rights	EMBARGO_NOT_AUBURN	en_US
dc.subject	Computer Science	en_US
dc.title	TCP/IP Implementation of Hadoop Acceleration	en_US
dc.type	thesis	en_US
dc.embargo.length	NO_RESTRICTION	en_US
dc.embargo.status	NOT_EMBARGOED	en_US

Files in this item

Name: thesis.pdf.txt
Size: 53.93Kb

Name: thesis.pdf
Size: 1.305Mb
