This Is Auburn
Electronic Theses and Dissertations

Optimization Approaches to Identifying Personalized Colorectal Cancer Screening Strategies




Young, David

Type of Degree

PhD Dissertation


Chemical Engineering

Restriction Status


Restriction Type

Auburn University Users

Date Available



This dissertation investigates optimization approaches for solving a problem termed the colorectal cancer screening problem (CRCSP). This problem aims to identify a screening strategy for testing an asymptomatic population that maximizes the benefit of screening, measured in either monetary or societal terms. The CRCSP includes two types of uncertainty, exogenous and endogenous; under the latter, the decision variables impact the probabilities associated with the uncertainty outcomes. All decisions take discrete values, making the problem at best a mixed-integer problem. We investigated solving the CRCSP through two different approaches. The first approach, simulation optimization, integrates a microsimulation (MSM) model within a derivative-free optimization (DFO) framework to search for the optimal screening strategy for the simulated population. The second approach uses mathematical programming, specifically stochastic programming (SP), as the framework to model and solve the problem. To implement the simulation-optimization approach, an MSM of colorectal cancer progression was first reconstructed, and its outputs were verified using literature data. A comprehensive study of DFO solvers was then performed to identify the solver best suited to efficiently and reliably solving combinatorial optimization problems. It was found that the commercial solver TOMLAB/glcSolve, an implementation of the dividing hyper-rectangles (DIRECT) algorithm, performs best for combinatorial problems with a low number of decision variables. The next best solver is the derivative-free line search (DFL) algorithm, which performs similarly to glcSolve. As the number of decision variables increases, however, DFL outperforms glcSolve. Using glcSolve, an optimal screening strategy was identified that provided a 31% improvement in quality-adjusted life-years (QALY) gained while increasing the overall costs by only 5% compared to the currently recommended strategy.
The DFO framework was then used to analyze the impact of different simulation assumptions and parameter values on the overall optimal screening strategy and the optimal solution. The study showed that the optimal screening strategy identified was most sensitive to changes in the relative risk (RR) associated with the transition probability of colorectal cancer (CRC), followed by the compliance modeling and the willingness-to-pay (WTP) ratio. Changes in the RR altered the screening frequency and the screening start and end ages. Compliance modeling mainly impacted the screening modality. When the WTP changed, the screening start age was affected. The CRCSP was modeled in two different ways using an SP framework. The first represented the CRCSP as a two-stage SP (TSSP), where the type I endogenous uncertainty caused the problem to be a mixed-integer nonlinear program (MINLP). Two different direct linearization procedures were then applied to the MINLP, and the resulting models and their solution times were assessed, revealing a trade-off between model size and solution time for the two linearization procedures. The solution times of both models were found to depend significantly on the size of the scenario set, leading to an investigation of efficient scenario set construction methodologies to best represent the rare-event region of the uncertainty space. The results revealed that the distance-based clustering methods, k-means and x-means, provided very stable and accurate scenario sets compared to a number of sampling schemes. The second approach to the CRCSP represented the problem as a multi-stage SP (MSSP). The uncertainty was modeled differently than in the TSSP to maintain computational tractability. However, this change meant that no closed-form expression was available to calculate the benefits of screening. This problem was overcome by integrating a machine learning (ML) model within the MSSP formulation to estimate the screening benefits.
The data to train the ML model was gathered from the MSM reconstructed from literature. The resulting formulation allowed for a computationally tractable and easily solvable MSSP for the CRCSP.