
Improving Interpretability and Accuracy of Artificial Intelligence in Natural Language and Image Understanding

Date

2024-07-22

Author

Pham, Thang

Type of Degree

PhD Dissertation

Department

Computer Science and Software Engineering

Abstract

Transformer-based models, such as BERT, have revolutionized natural language processing (NLP) by setting new standards for accuracy and capability. BERT has achieved state-of-the-art (SOTA) results on many NLP tasks and benchmarks, such as GLUE, even surpassing human performance. Despite these successes, questions remain about whether these models truly understand natural language the way humans do. Our findings reveal that BERT-based classifiers often disregard the sequential order of words when evaluated on GLUE. Using LIME, an attribution method that visualizes how much each token contributes to a model's prediction, we find that instead of understanding sentence meaning, these models rely on superficial cues. Additionally, incorporating BERT into attribution methods to generate more plausible counterfactuals for interpreting text classifiers has proven problematic: our research shows that BERT is not particularly useful in this context unless the attribution method, such as LIME, produces out-of-distribution samples.

The limitations of current NLP benchmarks such as GLUE are evident, as they do not require models to understand the surrounding context before making predictions. To address this, we introduce the Phrase-in-Context (PiC) benchmark. PiC requires models to comprehend the context before interpreting the meaning of a phrase, since that meaning is context-dependent. This benchmark poses a significant challenge to models, with even GPT-4 (as of March 2023) achieving only 64–75% accuracy.

Moreover, recognizing the value of text-based explanations, we propose using part-based object descriptors generated by GPT-4 to explain image classification systems. By grounding these descriptions in specific regions of an object image, we aim to enhance both interpretability and performance. This approach is exemplified by PEEB, a part-based image classifier that is also transformer-based. PEEB translates class names into descriptive texts and matches visual parts to these descriptions for improved classification. It significantly outperforms existing explainable models, particularly in zero-shot settings, and allows classifiers to be customized without retraining, thereby advancing both interpretability and accuracy in fine-grained image classification.
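As an illustration of the attribution analysis described in the abstract, the following is a minimal sketch of how LIME can be applied to a BERT-based text classifier using the `lime` and `transformers` Python packages. The model checkpoint and label names here are assumptions for illustration only, not the exact setup used in the dissertation.

```python
# A minimal sketch of LIME token attribution for a BERT-based classifier.
# The checkpoint name below is assumed for illustration, not the dissertation's setup.
import numpy as np
import torch
from lime.lime_text import LimeTextExplainer
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "textattack/bert-base-uncased-SST-2"  # hypothetical sentiment checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def predict_proba(texts):
    """Return class probabilities for a batch of texts, in the shape LIME expects."""
    enc = tokenizer(list(texts), return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).numpy()

explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance(
    "The plot is thin, but the acting makes the film worth watching.",
    predict_proba,
    num_features=8,
)
# Each pair is (token, weight): the token's estimated contribution to the prediction.
print(exp.as_list())
```

Inspecting these per-token weights is one way to check whether a classifier is attending to meaningful words or to superficial cues such as individual keywords.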