Improving Prediction Accuracy Using Class-specific Ensemble Feature Selection

Soares, Caio

Metadata Field	Value	Language
dc.contributor.advisor	Gilbert, Juan
dc.contributor.advisor	Dozier, Gerry
dc.contributor.advisor	Seals, Cheryl
dc.contributor.author	Soares, Caio
dc.date.accessioned	2010-08-03T13:12:56Z
dc.date.available	2010-08-03T13:12:56Z
dc.date.issued	2010-08-03T13:12:56Z
dc.identifier.uri	http://hdl.handle.net/10415/2273
dc.description.abstract	As data accumulates significantly faster than it can be processed, data preprocessing techniques such as feature selection become increasingly important and beneficial. Moreover, given the well-known gains of feature selection, any further improvement can positively affect a wide array of fields and applications. This research therefore explores a novel feature selection architecture, Class-specific Ensemble Feature Selection (CEFS), which finds a class-specific subset of features that is optimal for each class in the dataset. Each subset is then combined with a classifier to create an ensemble feature selection model, which is used to predict unseen instances. CEFS attempts to provide the diversity and base classifier disagreement sought after in effective ensemble models by providing highly useful, yet highly exclusive, feature subsets. CEFS is not a feature selection algorithm but rather a unique way of performing feature selection. Hence, it is also algorithm independent, suggesting that various machine learners and feature selection algorithms can benefit from the use of this architecture. To test this architecture, a comprehensive experiment is conducted, implementing the architecture with two different classifiers and three different feature selection algorithms on ten different datasets. The results of this experiment show that the CEFS architecture outperforms the traditional feature selection architecture in every algorithmic combination and for every dataset. Moreover, the inclusion of high-dimensional datasets in the experiment suggests that CEFS will scale up. Finally, the feature results obtained from the experiment suggest that vital class-specific information can be lost if feature selection is performed on the dataset as a whole rather than in a class-specific manner.	en
dc.rights	EMBARGO_NOT_AUBURN	en
dc.subject	Computer Science	en
dc.title	Improving Prediction Accuracy Using Class-specific Ensemble Feature Selection	en
dc.type	dissertation	en
dc.embargo.length	NO_RESTRICTION	en_US
dc.embargo.status	NOT_EMBARGOED	en_US
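
The architecture described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not taken from the dissertation: the feature score (gap between in-class and out-of-class means), the base classifier (a one-vs-rest nearest-centroid rule), the way members are combined (largest margin wins), the toy data, and the hypothetical names `select_features`, `fit_cefs`, and `predict_cefs`. The abstract states only that each class gets its own feature subset, that each subset is paired with a classifier, and that the resulting ensemble predicts unseen instances.

```python
from statistics import mean


def select_features(X, y, target, k):
    """Pick the k features that best separate `target` from all other classes.

    Assumed score: absolute gap between the in-class and out-of-class means.
    Any feature selection algorithm could be substituted here.
    """
    inside = [row for row, label in zip(X, y) if label == target]
    outside = [row for row, label in zip(X, y) if label != target]
    n_features = len(X[0])
    scores = [
        abs(mean(r[j] for r in inside) - mean(r[j] for r in outside))
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def centroid(rows, features):
    """Mean vector of `rows`, restricted to the selected feature indices."""
    return [mean(r[j] for r in rows) for j in features]


def sq_dist(row, features, center):
    """Squared Euclidean distance from `row` to `center` on selected features."""
    return sum((row[j] - c) ** 2 for j, c in zip(features, center))


def fit_cefs(X, y, k):
    """Build one ensemble member per class: (feature subset, two centroids)."""
    model = {}
    for target in sorted(set(y)):
        features = select_features(X, y, target, k)
        inside = [row for row, label in zip(X, y) if label == target]
        outside = [row for row, label in zip(X, y) if label != target]
        model[target] = (
            features,
            centroid(inside, features),
            centroid(outside, features),
        )
    return model


def predict_cefs(model, row):
    """Return the class whose member is most confident (assumed combination rule)."""
    def margin(target):
        features, c_in, c_out = model[target]
        # Positive when the row is closer to the class than to the rest.
        return sq_dist(row, features, c_out) - sq_dist(row, features, c_in)
    return max(model, key=margin)


# Toy data (hypothetical): three classes, four features; feature 3 is noise.
X = [
    [5.0, 0.1, 0.2, 1.0], [5.2, 0.0, 0.1, 1.1],   # class "a"
    [0.1, 4.9, 0.2, 1.0], [0.0, 5.1, 0.0, 0.9],   # class "b"
    [0.2, 0.1, 5.0, 1.0], [0.1, 0.0, 5.2, 1.1],   # class "c"
]
y = ["a", "a", "b", "b", "c", "c"]

model = fit_cefs(X, y, k=1)
for label, (features, _, _) in model.items():
    print(label, "uses features", features)
print(predict_cefs(model, [4.8, 0.2, 0.1, 1.0]))
```

In this toy setup each class ends up with a different single feature, which is the kind of exclusive, class-specific subset the abstract describes; selecting one shared subset for the whole dataset would not expose that per-class structure.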

Files in this item

Name: Dissertation_(Draft_13).pdf.txt
Size: 239.2Kb

Name: Dissertation_(Draft_13).pdf
Size: 1.664Mb
