Zero-Shot Multi-Label Topic Inference

Sarkar, Souvika

Date

2024-08-15

Author

Sarkar, Souvika

Type of Degree

PhD Dissertation

Department

Computer Science and Software Engineering

Restriction Status

EMBARGOED

Restriction Type

Full

Date Available

2029-08-15

Abstract

Despite significant progress in NLP research, we still lack a general-purpose inference tool that can effectively serve users across a wide range of application domains. One way to address this challenge is to collect supervised training examples from each user, labeled with that user's custom-defined topics of interest, and then train a classifier on those examples to infer topics. However, creating custom training examples is costly and time-consuming, and the scarcity of high-quality training examples poses a serious challenge for data-hungry deep-learning models. The machine learning (ML) community has therefore recently been moving toward zero-shot and transfer learning approaches. In this thesis, we study a fundamental yet relatively unexplored NLP task called “Zero-Shot Multi-Label Topic Inference,” which infers topics for documents when the model has previously seen neither the documents nor the topics. Because no benchmark dataset was readily available for this task, we first created two real-world datasets, the “News Concept Dataset” and the “Medical Concept Dataset,” to make evaluation possible. We then performed a detailed study of how to leverage state-of-the-art (SOTA) sentence encoders and Large Language Models (LLMs) for the task when the topics are defined by users in real time. These models have been shown to achieve superior performance on many downstream text-mining tasks and have thus been claimed to be fairly general. Through extensive experiments, we 1) developed various embedding procedures, 2) designed multiple Zero-shot Topic Inference methods, 3) presented a comparative study of the SOTA sentence encoders and LLMs, 4) evaluated Zero-shot models from a Multi-User Perspective, and 5) conducted a case study on the limitations of the SOTA sentence encoders. Finally, we demonstrate potential applications of Zero-shot Topic Inference in two distinct domains: a) policy-making in healthcare management and b) evaluating student performance.
We also discuss the potential implications of zero-shot models for low-resource languages and outline our future plans. This work highlights the transformative potential of zero-shot and transfer learning principles for building more flexible, responsive, and broadly applicable AI systems, ultimately bridging the gap between rigid pre-trained models and the dynamic requirements of real-world applications.

URI

https://etd.auburn.edu//handle/10415/9470