Enhancing User Experience through Improving the User Interface of Phonetics Tools and Studies on Phone-level ASR-based Automation through Deep Learning Techniques

Ren, Chang

Date

2023-08-04

Type of Degree

PhD Dissertation

Department

Computer Science and Software Engineering

Abstract

This research comprises three studies at the intersection of communication disorders and computational linguistics. We begin with a case study of APTgt, a system created to improve reinforcement for phonetics students and to improve the linguistic tools available for their instruction. A portion of this system uses machine learning techniques (i.e., multi-class classification) to generate exams automatically. Following this work, we endeavored to enhance the user experience by automatically transcribing user speech at the phoneme level, investigating grapheme-to-phoneme (G2P) conversion from English text to IPA format to support phonetic transcription and automatic exam generation. In the literature, we found support for standard speech through G2P but no evidence of support for disordered speech. We therefore use Automatic Speech Recognition (ASR) with deep learning techniques to recognize disordered speech. This work aims to improve user experience and user interface design and to incorporate deep learning techniques that provide phonetic transcription for disordered speech. Deep learning techniques were used to support the development of a Speech-to-IPA module for disordered speech and to increase user efficiency by generating a large number of phonetic transcription exam resources as a word bank for exam development.

URI

https://etd.auburn.edu//handle/10415/8915