This Is Auburn: Electronic Theses and Dissertations


Assessment of Multiple MapReduce Strategies for Fast Analytics of Small Files


Metadata Field | Value | Language
dc.contributor.advisor | Yu, Weikuan | en_US
dc.contributor.author | Zhou, Fang | en_US
dc.date.accessioned | 2015-05-05T20:28:22Z |
dc.date.available | 2015-05-05T20:28:22Z |
dc.date.issued | 2015-05-05 |
dc.identifier.uri | http://hdl.handle.net/10415/4565 |
dc.description.abstract | Hadoop, an open-source implementation of MapReduce, is widely used because of its ease of programming, scalability, and availability. The Hadoop Distributed File System (HDFS) and Hadoop MapReduce are two important components of Hadoop; Hadoop MapReduce processes the data stored in HDFS. With the explosive development of cloud computing, a growing number of business and scientific applications take advantage of Hadoop. The files processed in Hadoop are no longer limited to very large ones. Large numbers of small files in both business and scientific domains, such as document files, bioinformatics files, and geographic information files, are processed by MapReduce. In this situation, the MapReduce performance of Hadoop degrades severely. Although Hadoop itself and other frameworks provide some MapReduce strategies, they are not directly designed for small files. In addition, there is no theoretical analysis for evaluating MapReduce strategies for small files. In this paper, I analyze different existing MapReduce strategies for small files and use theoretical and empirical methods to determine the best MapReduce strategy for processing small files. The experimental results show the correctness and efficiency of the analysis. | en_US
dc.subject | Computer Science | en_US
dc.title | Assessment of Multiple MapReduce Strategies for Fast Analytics of Small Files | en_US
dc.type | Master's Thesis | en_US
dc.embargo.status | NOT_EMBARGOED | en_US
dc.contributor.committee | Biaz, Saad | en_US
dc.contributor.committee | Yilmaz, Levent | en_US
dc.contributor.committee | Baskiyar, Sanjeev | en_US
