On the Machine Illusion
Type of Degree: PhD Dissertation
Computer Science and Software Engineering
In this work, we empirically study an emerging problem in the machine learning community: adversarial samples. We focus on adversarial samples for neural networks, whose existence reveals yet another inconsistency in our hypotheses about neural networks. An adversarial sample is usually generated by adding very small, carefully chosen noise to a clean data sample, e.g., changing a few pixel values in an image or replacing a few words in a sentence. Although adversarial samples are almost identical (visually or semantically) to the clean samples from a human perspective, they trick a well-trained neural network into making wrong predictions with very high confidence. In addition, we show that adversarial samples exist in the real world when objects appear in an unusual pose (e.g., a flipped-over school bus). We study this problem from both sides: defending against adversarial samples and generating them. To defend against adversarial samples, we propose a binary classification method that filters them out. It achieves almost perfect accuracy on adversarial samples from seen distributions but fails to recognize adversarial samples from unseen distributions. To generate adversarial samples, we first propose a framework that generates text adversarial samples for text classification problems (e.g., sentiment analysis). The framework generates high-quality text adversarial samples; its limitation is that we do not have explicit control over the semantics and syntax. We also propose a framework that generates image adversarial samples by rendering 3D objects in unusual poses. It shows that natural adversarial samples may exist in abundance in the real world. What this dissertation lacks is a theoretical exploration of the problem. We may revisit it when the theories behind neural networks mature.
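To make the idea of "very small and carefully chosen noise" concrete, the following is a minimal sketch and not the method proposed in the dissertation. It applies a gradient-sign perturbation (in the style of the fast gradient sign method) to a toy logistic-regression classifier instead of a neural network. The weights `w`, bias `b`, input `x`, and budget `eps` are hypothetical values chosen for illustration only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def perturb(w, b, x, y, eps):
    """Shift each feature by at most eps in the direction that increases the loss."""
    p = predict(w, b, x)
    # For cross-entropy loss, the gradient with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and clean sample (assumptions, not from the dissertation).
w = [2.0, -3.0, 1.5, -2.5]
b = 0.0
x = [0.7, 0.3, 0.6, 0.3]
y = 1  # true label of the clean sample

x_adv = perturb(w, b, x, y, eps=0.1)

clean_label = int(predict(w, b, x) > 0.5)
adv_label = int(predict(w, b, x_adv) > 0.5)
print("clean prediction:", clean_label)
print("adversarial prediction:", adv_label)
```

In this toy setting each feature moves by at most 0.1, yet the predicted label changes from 1 to 0. The sketch only illustrates the mechanism; it does not reproduce the high-confidence misclassifications reported for well-trained neural networks.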