
PianoMentor: A Technological Framework to Teach and Practice Piano/Keyboard Online via Machine Learning with an Embedded Practice Lesson Generator

Date

2024-07-30

Author

Jamshidi, Fatemeh

Type of Degree

PhD Dissertation

Department

Computer Science and Software Engineering

Restriction Status

EMBARGOED

Restriction Type

Auburn University Users

Date Available

2025-07-30

Abstract

This dissertation develops an advanced artificial system that perceives, coordinates, and analyzes music scores generated from human performances in real time, with the goal of transforming online music technique learning. By harnessing current machine learning technology, the system improves on existing music practice methods and fosters self-directed learning among students, thereby significantly impacting music education. Advances in Artificial Intelligence (AI) and Human-Computer Interaction (HCI) have enabled computer music systems to collaborate with humans across a range of applications; this work applies machine learning and deep learning techniques to give music students and educators a more intuitive and effective practice experience.

The research focuses on three aspects of human-computer collaborative music practice: (1) automating music transcription to extract pitch and timing information; (2) generating real-time performance analysis reports as annotated music scores; and (3) generating practice exercises based on students' performances while adapting different practice strategies.

To achieve these goals, we use a framewise pitch detection model conditioned on onset predictions: a new note may start only when the onset detector confirms it. Jointly refining onsets and offsets in this way aligns the transcription more closely with human musical perception.

We also propose a system for visualizing and comparing MIDI files, addressing the abstract nature of raw MIDI data. Depending on the user group and task, the system offers card lists for browsing multiple MIDI files, heatmaps of note distributions, a Note Histogram of note-occurrence counts, pitch-time charts, adapted MatrixWave visualizations of note sequences, and diagrams that visualize similarities between pattern representations of sequences. The system is implemented as a browser application and evaluated through a usage scenario, which demonstrates its effectiveness for the specified user tasks and also reveals some limitations.

Finally, we explore the potential of large language models (LLMs) such as ChatGPT, ChatMusician, and MusicLang for generating practice exercises tailored to a student's performance. Unlike traditional methods that require deep musical and statistical knowledge, LLMs let users describe the music they want directly in natural language. We evaluate the models' composing abilities by how well their output aligns with the user's input and by the overall quality of the compositions, considering factors such as repetitiveness and scale diversity. This research aims to clarify LLMs' strengths, applications, and limitations in music, paving the way for their expanded role in computer-assisted composition and the broader music industry.
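The onset-conditioned decoding step can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes two per-frame probability matrices (frame_probs and onset_probs, shape [time, pitches]) such as an Onsets-and-Frames-style transcription network might produce, plus a placeholder frame rate and MIDI pitch offset.

```python
import numpy as np

def decode_notes(frame_probs, onset_probs, threshold=0.5,
                 frame_rate=31.25, min_midi=21):
    """Decode (pitch, onset_time, offset_time) notes from framewise
    probabilities. A new note may start only where the onset detector
    fires; it sustains while the frame detector stays active."""
    frames = frame_probs >= threshold           # [T, P] boolean activations
    onsets = onset_probs >= threshold
    T, P = frames.shape
    notes, active = [], {}                      # active: pitch -> start frame
    for t in range(T):
        for p in range(P):
            if p not in active and onsets[t, p] and frames[t, p]:
                active[p] = t                   # onset confirmed: note begins
            elif p in active and not frames[t, p]:
                notes.append((min_midi + p, active.pop(p) / frame_rate,
                              t / frame_rate))  # frame activity ended: offset
    for p, t0 in active.items():                # close notes still sounding at the end
        notes.append((min_midi + p, t0 / frame_rate, T / frame_rate))
    return notes
```

Gating note starts on the onset head, rather than on frame activity alone, is what prevents spurious re-attacks while a note sustains.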
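For the MIDI visualizations, the underlying data reduces to simple aggregates over note events. The sketch below assumes the pretty_midi parser (any note-level MIDI reader would do) and computes the note-occurrence counts behind a Note Histogram together with the (start, end, pitch) triples behind a pitch-time chart; the function name is illustrative, not the system's API.

```python
from collections import Counter
import pretty_midi  # assumption: any MIDI parser with note-level access works

def note_histogram(midi_path):
    """Count how often each pitch occurs in a MIDI file -- the raw data
    behind a Note Histogram or note-distribution heatmap."""
    midi = pretty_midi.PrettyMIDI(midi_path)
    counts = Counter()
    events = []                    # (start, end, pitch) for a pitch-time chart
    for instrument in midi.instruments:
        if instrument.is_drum:     # drum tracks carry no melodic pitch
            continue
        for note in instrument.notes:
            counts[note.pitch] += 1
            events.append((note.start, note.end, note.pitch))
    return counts, sorted(events)
```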
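Exercise generation with an LLM reduces, at its core, to prompt construction around a performance report. The following hypothetical sketch shows the idea; the prompt wording, the report fields (weak_bars, key, tempo), and the llm_client.complete call are all illustrative assumptions, not the dissertation's interface.

```python
def practice_exercise_prompt(weak_bars, key, tempo):
    """Build a natural-language prompt asking an LLM (e.g. ChatGPT or
    ChatMusician) for a targeted exercise. The report fields here are
    hypothetical placeholders, not the dissertation's actual schema."""
    return (
        f"The student played bars {weak_bars} inaccurately in a piece "
        f"in {key} at {tempo} BPM. Compose a short practice exercise in "
        f"ABC notation that isolates those rhythms and intervals, "
        f"avoids excessive repetition, and stays within {key}."
    )

# Hypothetical usage with any chat-completion client:
# exercise_abc = llm_client.complete(practice_exercise_prompt([3, 7], "G major", 80))
```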
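Likewise, repetitiveness and scale diversity can be scored with simple proxies. The two metrics below (share of repeated pitch n-grams, and distinct pitch classes used out of twelve) are assumptions made for illustration; the dissertation's actual evaluation criteria may differ.

```python
def repetition_and_diversity(pitches, n=4):
    """Two simple proxies, assumed here rather than taken from the
    dissertation: n-gram repetition (share of repeated pitch n-grams)
    and scale diversity (fraction of the 12 pitch classes used)."""
    ngrams = [tuple(pitches[i:i + n]) for i in range(len(pitches) - n + 1)]
    repetition = 1 - len(set(ngrams)) / max(len(ngrams), 1)
    diversity = len({p % 12 for p in pitches}) / 12
    return repetition, diversity
```

A highly repetitive exercise scores near 1 on the first metric; a piece that wanders far outside its scale scores high on the second, so the two can be balanced when ranking candidate compositions.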