TCP/IP Implementation of Hadoop Acceleration
Metadata Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yu, Weikuan | |
dc.contributor.author | Xu, Cong | |
dc.date.accessioned | 2012-05-18T13:44:02Z | |
dc.date.available | 2012-05-18T13:44:02Z | |
dc.date.issued | 2012-05-18 | |
dc.identifier.uri | http://hdl.handle.net/10415/3167 | |
dc.description.abstract | Cloud computing is a booming technology in computer science. Since Google released the design details of the MapReduce technique in 2004 [1], cloud computing has become increasingly popular. Hadoop [2] has been developed as an open-source implementation of MapReduce. A new network-levitated merge mechanism (Hadoop-A) [3] improves the existing Hadoop framework to solve many problems in the original framework. Hadoop-A avoids repetitive merging of data and introduces a full pipeline that consists of the shuffle, merge, and reduce phases. However, Hadoop-A is implemented on InfiniBand RDMA technology, which is not commonly deployed on commercial servers. On the other hand, data transmission based on the TCP/IP protocol is a robust technology, and its speed keeps increasing. Thus, we deem it worthwhile to complement our RDMA-based connection with an implementation built on the TCP/IP protocol. In this article, I describe the design and implementation of a TCP/IP implementation of Hadoop-A. Two components, MOFSupplier (server) and NetMerger (client), are introduced to realize the TCP/IP connection; they fetch data from MapTasks and send it to ReduceTasks within the new network-levitated merge mechanism. Multithreading is used to manage the memory pool, send and receive data, and merge data segments. The experimental results show that the TCP/IP implementation delivers good performance for Hadoop-A. Its execution time improves on that of the original Hadoop by 26.7%, and it also achieves good scalability. | en_US |
dc.rights | EMBARGO_NOT_AUBURN | en_US |
dc.subject | Computer Science | en_US |
dc.title | TCP/IP Implementation of Hadoop Acceleration | en_US |
dc.type | thesis | en_US |
dc.embargo.length | NO_RESTRICTION | en_US |
dc.embargo.status | NOT_EMBARGOED | en_US |