A Framework Supporting Human-AI Adversarial Authorship: The Analysis of User Frustration to Improve System Efficiency
Date
2023-08-03
Type of Degree
PhD Dissertation
Department
Computer Science and Software Engineering
Abstract
Authorship attribution techniques can trace a writing style back to a specific author. These techniques use machine learning algorithms to classify authors. This document discusses research focused on creating adversarial text to conceal an author's identity. Throughout this work we use AuthorCAAT, a tool that performs adversarial authorship to assist in anonymizing text using feature sets, language translations, and other transformation methods. We compare this tool with other anonymization techniques in attempts to circumvent detection by high-performing authorship attribution algorithms. We also explore combining anonymization methods to further improve performance against these algorithms. We extend this work by applying the anonymization methods in a partially observable environment, where the authorship attribution algorithms are incorporated into the process of creating the adversarial text. Another focus of this work is developing a more general framework for adversarial authorship. We do this by first examining the problem points of our tool and then designing a user interface that mitigates the issues that contribute to a poor user experience. We then work to improve the components of the framework by utilizing document clustering, deep learning, and model interpretability.