Classifying Speakers Using Voice Biometrics In a Multimodal World
Date
2009-07-30
Author
Rouse, Kenneth
Type of Degree
dissertation
Department
Computer Science
Abstract
This dissertation is a research study conducted to determine whether
a classification for a person is obtainable from the person's voice. The intent of this
work was to investigate a collection of voice samples for trends that potentially lead to
parameters usable in the classification of an individual. No specific classification area,
such as gender or ethnicity, was sought; it was preferred to allow the results
to dictate the characteristics that point to a particular classification group. In the data
collection stage, each participant was given the same task, and the resulting voice sample
was then analyzed. Analysis was conducted in phases, with the first phase focusing on
the time domain, which yielded parameters approximating the speed of speech and the
number of pauses in the sample. Next, the frequency domain was investigated, focusing on
the complexity of speech and voice tone attributes. The inquiries into this
domain concluded with the peaks in the frequency of the voice being tracked by frequency
threads and represented numerically by a third-order polynomial. It is the coefficients of
this polynomial that give a representation of an individual's voice, making it possible to
classify the individual into a particular group. To verify this, the coefficients from these polynomials
were used with a clustering application to validate the hypotheses of this study, in support
of the objective of providing empirical user data to contribute to the design of future phone
system communications.
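
The frequency-domain step described in the abstract, tracking spectral peaks over time and summarizing the track with a third-order polynomial, can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes a single "thread" formed by the strongest spectral peak in each frame, and the function names, frame length, hop size, and synthetic test signal are all hypothetical choices made for illustration.

```python
import numpy as np

def dominant_peak_track(signal, sample_rate, frame_len=1024, hop=512):
    """Return (times, peak_freqs): the strongest spectral peak of each frame.

    Assumption: one peak per frame stands in for a 'frequency thread';
    the dissertation's actual peak-tracking method is not given in the abstract.
    """
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times, peaks = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        magnitude = np.abs(np.fft.rfft(frame))
        peaks.append(freqs[np.argmax(magnitude)])            # frequency of the largest peak
        times.append((start + frame_len / 2) / sample_rate)  # frame centre, in seconds
    return np.array(times), np.array(peaks)

def cubic_coefficients(times, peak_freqs):
    """Fit a third-order polynomial to the track (4 coefficients, highest power first)."""
    return np.polyfit(times, peak_freqs, deg=3)

# Synthetic stand-in for a voice sample: a tone gliding upward from 200 Hz.
sample_rate = 8000
t = np.arange(0, 1.0, 1.0 / sample_rate)
signal = np.sin(2 * np.pi * (200 * t + 100 * t ** 2))

times, peaks = dominant_peak_track(signal, sample_rate)
coeffs = cubic_coefficients(times, peaks)
print(coeffs)  # four numbers that summarize the shape of the peak track
```

In a setting like the one the abstract describes, one such coefficient vector per speaker could then be passed to a clustering tool; which tool and which distance measure were used is not stated in the abstract.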