This Is Auburn: Electronic Theses and Dissertations


Go with the flow: Data Flow Analysis for Binary Differencing


Metadata Field | Value | Language
dc.contributor.advisor | Umphress, David A. |
dc.contributor.advisor | Cross, James H., II |
dc.contributor.advisor | Overbey, Jeffrey |
dc.contributor.advisor | Hamilton, Michael |
dc.contributor.author | Denton, Benjamin |
dc.date.accessioned | 2014-07-30T15:40:08Z |
dc.date.available | 2014-07-30T15:40:08Z |
dc.date.issued | 2014-07-30 |
dc.identifier.uri | http://hdl.handle.net/10415/4323 |
dc.description.abstract | Differencing in computer science is often used to quickly determine the differences between two files. While this works well for plain text files, such as source code, applying differencing to binary executable files is more difficult. Compiled binary files contain lists of instructions that, when executed, perform operations using functions and data specified in a higher-level programming language, such as C++. Syntactic changes to these instructions (changes in the form of an instruction) do not always reflect semantic changes (changes in the behavior of an instruction). Depending on the settings and optimizations of the compiler, a series of instructions from one binary executable could perform the same function as a different series of instructions from another binary. These types of differences are difficult to detect using current binary differencing methods. This dissertation explores software reverse engineering, binary differencing, and software semantics vs. syntax. We define a framework, which we call Data Flow Binary Differencing, for performing binary differencing using data flow analysis and comparing the semantics of the data flow within a pair of functions. We discuss three use cases that illustrate how to implement the Data Flow Binary Differencing framework and show how our technique stands up against challenges faced by other binary differencing techniques. The major contribution of this research is the use of data flow and assembly language semantics to define a method for comparing a pair of functions and testing for similarities. We also discover that testing for semantic differences versus syntactic differences within a binary can expose semantic differences introduced by an optimizing compiler. | en_US
dc.subject | Computer Science | en_US
dc.title | Go with the flow: Data Flow Analysis for Binary Differencing | en_US
dc.type | dissertation | en_US
dc.embargo.status | NOT_EMBARGOED | en_US

