This Is Auburn
Electronic Theses and Dissertations

Authorship Attribution via Evolutionary Hybridization of Sentiment Analysis, LIWC, and Topic Model Features




Gaston, Joshua

Type of Degree

Master's Thesis


Computer Science and Software Engineering


Authorship Attribution is a well-studied topic with deep roots in the field of Stylometry. This thesis examines three less traditional feature sets for Authorship Attribution: features derived from Sentiment Analysis, LIWC (Linguistic Inquiry and Word Count), and Topic Models. Each feature set is evaluated alone and in combination with the others. Using methods from Multimodal Machine Learning, the feature sets are combined in an effort to improve the performance of Authorship Attribution systems. A feature selection method based on a steady-state genetic algorithm, known as GEFeS (Genetic and Evolutionary Feature Selection), is then used to examine many different subsets of the combined feature set and further improve the performance of the Authorship Attribution systems.
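
The two steps named in the abstract, combining feature sets and then searching over feature subsets with a steady-state genetic algorithm, can be illustrated with a minimal sketch. This is not the thesis's GEFeS implementation, whose configuration is not given here. It assumes simple concatenation as the combination step and a generic steady-state scheme (binary tournament selection, uniform crossover, bit-flip mutation, replace-worst). The names `concatenate_features`, `steady_state_ga`, and `toy_fitness`, and all parameter values, are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Mask = List[int]  # 1 = feature kept, 0 = feature dropped


def concatenate_features(*blocks: Sequence[float]) -> List[float]:
    """Feature-level fusion (assumed): join per-document blocks into one vector."""
    fused: List[float] = []
    for block in blocks:
        fused.extend(block)
    return fused


def steady_state_ga(
    n_features: int,
    fitness: Callable[[Mask], float],
    pop_size: int = 20,        # hypothetical setting
    evaluations: int = 500,    # hypothetical setting
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Mask:
    """Generic steady-state GA over feature masks; returns the best mask found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    scores = [fitness(ind) for ind in population]

    def tournament() -> Mask:
        # Binary tournament: the fitter of two distinct random individuals.
        a, b = rng.sample(range(pop_size), 2)
        return population[a] if scores[a] >= scores[b] else population[b]

    for _ in range(evaluations):
        p1, p2 = tournament(), tournament()
        # Uniform crossover: each bit comes from either parent.
        child = [rng.choice(pair) for pair in zip(p1, p2)]
        # Bit-flip mutation.
        child = [1 - bit if rng.random() < mutation_rate else bit
                 for bit in child]
        child_score = fitness(child)
        # Steady-state step: one child may replace the current worst member.
        worst = min(range(pop_size), key=scores.__getitem__)
        if child_score >= scores[worst]:
            population[worst] = child
            scores[worst] = child_score

    best = max(range(pop_size), key=scores.__getitem__)
    return population[best]


def toy_fitness(mask: Mask) -> float:
    """Hypothetical stand-in for attribution accuracy on held-out texts."""
    useful = {0, 3, 4}  # pretend only these features carry signal
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in useful)
    return hits - 0.1 * sum(mask)  # small penalty for larger subsets


# Toy per-document blocks standing in for sentiment, LIWC, and topic features.
vector = concatenate_features([0.2, 0.8], [0.1, 0.4, 0.3], [0.6, 0.05, 0.35])
best_mask = steady_state_ga(len(vector), toy_fitness)
selected = [value for value, bit in zip(vector, best_mask) if bit]
```

In a real system the fitness function would train and score an attribution classifier on the masked feature vectors. The toy fitness only keeps the sketch self-contained.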