|dc.description.abstract||This dissertation aims to improve survival analysis for heart transplant surgeries by first investigating factors associated with the performance of predictive machine learning algorithms, then running multiple prediction experiments and selecting the best combination of these factors. The factors comprise data preparation methods (imputation, encoding, and data cleaning), feature selection methods (filter, wrapper, and hybrid), resampling methods for unbalanced data (synthetic oversampling and undersampling), and well-known predictive algorithms (Logistic Regression, Linear Discriminant Analysis, Multivariate Adaptive Regression Splines, Neural Networks, Naive Bayes, Support Vector Machines, eXtreme Gradient Boosting, Stochastic Gradient Boosting, and Random Forest).
The goal of the second part is to introduce an approach that delivers monotonic survival probabilities, so that practitioners can interpret the predicted survival probabilities more easily. The model with the highest predictive performance in the first part of the study is carried into a post-calibration phase to produce monotonic predictions. In addition, a tool is developed that enables practitioners to use the best predictive model to investigate the survival of heart transplant patients over a 10-year window after the transplantation surgery.
This study is presented in five major sections. The first section highlights the need for a systematic approach to developing machine learning-based research and its application to transplant surgeries. The second section reviews the major challenges in predicting survival after heart transplantation surgery. The third section presents a systematic approach, based on Design of Experiments (DoE), for developing and optimizing predictive models in the heart transplantation domain. The fourth section addresses the identified challenges in heart transplantation studies. The last section discusses conclusions, limitations, and future studies. To facilitate reproduction of these results and to explain the details of the analysis, the code and instructions are provided in the appendix.||en_US