This Is Auburn Electronic Theses and Dissertations

Nonparametric Methods for Classification and Related Feature Selection Procedures




Yin, Shuxin

Type of Degree



Mathematics and Statistics


One important application of gene expression microarray data is the classification of samples into categories, such as tumor types. Gene selection procedures are crucial because gene expression data from DNA microarrays typically consist of thousands of measured genes on only a few subjects, and not all of these genes are thought to determine a specific genetic trait. In this dissertation, I develop a novel nonparametric procedure for selecting such genes. This rank-based forward selection procedure rewards genes for their contribution towards determining the trait but penalizes them for their similarity to genes that have already been selected. I show that my method gives lower misclassification error rates than dimension reduction methods such as principal component analysis and partial least squares analysis. I also explore further properties of the Wilcoxon-Mann-Whitney (WMW) statistic and propose a new WMW-based classifier to reduce the misclassification error rate. Real data analysis and Monte Carlo simulation demonstrate the superiority of the proposed methods over classical methods in several situations.
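The reward-and-penalty idea behind a rank-based forward selection can be illustrated with a minimal sketch. It is not the dissertation's actual scoring rule, which the abstract does not specify. It assumes that a gene's contribution is measured by a scaled WMW statistic between two classes, that similarity is measured by an absolute rank correlation with the genes already chosen, and that the two are combined with a hypothetical weight `penalty`. All function names and the toy data are illustrative.

```python
import numpy as np

def wmw_separation(x, y):
    """Scaled WMW statistic: |2 * P(a < b) - 1| for a in class 0, b in class 1."""
    a, b = x[y == 0], x[y == 1]
    # Count pairs with a < b; ties count as one half.
    wins = (a[:, None] < b[None, :]).sum() + 0.5 * (a[:, None] == b[None, :]).sum()
    p = wins / (len(a) * len(b))
    return abs(2.0 * p - 1.0)

def ranks(x):
    # Ordinal ranks (ties broken by position); adequate for continuous data.
    return np.argsort(np.argsort(x)).astype(float)

def rank_correlation(u, v):
    """Absolute Spearman-type correlation computed on ordinal ranks."""
    return abs(np.corrcoef(ranks(u), ranks(v))[0, 1])

def forward_select(X, y, k, penalty=1.0):
    """Greedy selection: reward class separation, penalize similarity.

    Assumed score (illustrative only):
        relevance(j) - penalty * max similarity to already selected genes.
    """
    n_genes = X.shape[1]
    relevance = np.array([wmw_separation(X[:, j], y) for j in range(n_genes)])
    selected = [int(np.argmax(relevance))]  # start with the most relevant gene
    while len(selected) < min(k, n_genes):
        best_j, best_score = None, -np.inf
        for j in range(n_genes):
            if j in selected:
                continue
            redundancy = max(rank_correlation(X[:, j], X[:, s]) for s in selected)
            score = relevance[j] - penalty * redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data: 8 subjects (rows), 4 genes (columns), two classes.
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
g0 = np.array([1.0, 2, 3, 4, 5, 6, 7, 8])   # separates the classes
g1 = 2 * g0 + 1                             # monotone copy of g0 (redundant)
g2 = np.array([4.0, 3, 2, 1, 8, 7, 6, 5])   # separates the classes, less similar to g0
g3 = np.array([1.0, 5, 2, 6, 3, 7, 4, 8])   # weakly informative
X = np.column_stack([g0, g1, g2, g3])

chosen = forward_select(X, y, k=2)
print(chosen)
```

In this toy example, gene 1 separates the classes as well as gene 0 does, but it is a monotone copy of gene 0, so the similarity penalty removes its advantage and the less redundant gene 2 is picked second.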