Efficient Reliability Verification Testing in Open Source Software Using Priority Prediction
Date: 2015-05-08
Type of Degree: Dissertation
Department: Computer Science
Abstract
Open source software is becoming a strong alternative to private development for a wide range of applications. A stigma persists against using open source software in the private sector because of licensing restrictions and the preference for closed source proprietary technologies; adopting it is often a riskier, though cheaper, alternative to private development. This dissertation presents a novel procedure for efficient reliability verification testing of open source software. The procedure uses both static and dynamic analysis to prioritize methods, and the variables within those methods, for testing. Test case prioritization and minimization create a more efficient process for identifying key areas in the code for error handling. Fault injection testing is performed relative to the intended use cases of the system that will integrate the open source software. The results of the analysis and testing are used to calculate a metric of relative importance, which determines the highest priority locations for error detection and correction mechanisms. The metric data collected for modules, functions, and variables is used with a two-stage predictive model to reduce the total time needed to locate these critical variables in the system. The first stage uses binary classification over module- and function-level metric data to identify which functions contain the most critical variables with respect to reliability against data faults. The second stage uses logistic regression over variable-level metric data in those functions to predict the relative importance of the variables. By efficiently identifying the areas of the code most susceptible to data faults that lead to system failure, this procedure allows greater flexibility in the choice of open source software to integrate into an existing system.
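The following is a minimal sketch of the two-stage predictive model described in the abstract, assuming Python with scikit-learn. The feature matrices, labels, and the choice of a random forest for the stage-one binary classifier are all illustrative assumptions; the abstract specifies binary classification and logistic regression but does not name the classifier, features, or metrics actually used.

```python
# Sketch of the two-stage model: stage 1 flags functions likely to
# contain critical variables; stage 2 ranks variables in those
# functions by predicted relative importance.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- Stage 1: binary classification over functions ---
# X_func: per-function metric data (hypothetical features such as
# size, complexity, fan-in/fan-out); y_func: 1 if the function
# contains critical variables w.r.t. reliability against data faults.
X_func = rng.random((200, 5))               # placeholder metric data
y_func = (X_func[:, 0] > 0.6).astype(int)   # placeholder labels

stage1 = RandomForestClassifier(n_estimators=100, random_state=0)
stage1.fit(X_func, y_func)
critical_funcs = stage1.predict(X_func) == 1  # functions to test further

# --- Stage 2: logistic regression over variables ---
# X_var: per-variable metric data; in practice these rows would be
# drawn only from the functions flagged by stage 1. y_var: 1 if the
# variable proved critical under fault injection testing.
X_var = rng.random((500, 4))
y_var = (X_var[:, 1] + X_var[:, 2] > 1.0).astype(int)

stage2 = LogisticRegression(max_iter=1000)
stage2.fit(X_var, y_var)

# Rank variables by predicted probability of criticality; the highest
# ranked are the priority locations for error detection/correction.
importance = stage2.predict_proba(X_var)[:, 1]
ranking = np.argsort(importance)[::-1]
print("Top 5 variable indices by relative importance:", ranking[:5])
```

Ranking variables by the stage-two predicted probability mirrors the abstract's use of a relative importance metric to choose the highest priority locations for error detection and correction mechanisms.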