Design of a Speaker Recognition Code Using MATLAB
E. Darren Ellis
Department of Computer and Electrical Engineering - University of Tennessee, Knoxville
Tennessee 37996
(Submitted: 09 May 2001)
This project entails the design of a speaker recognition code using MATLAB. Signal
processing in the time and frequency domains yields a powerful method for analysis.
MATLAB's built-in functions for frequency domain analysis, together with its
straightforward programming interface, make it an ideal tool for speech analysis projects.
For the current project, experience was gained in general MATLAB programming and
the manipulation of time domain and frequency domain signals. Speech editing was
performed, as was degradation of signals by the addition of Gaussian noise.
Background noise was successfully removed from a signal by the application of a 3rd
order Butterworth filter. A code was then constructed to compare the pitch and formants
of a known speech file to those of 83 unknown speech files and choose the top twelve matches.
I. INTRODUCTION
Development of speaker identification systems began as early as the 1960s with
exploration into voiceprint analysis, in which the characteristics of an individual's voice
were thought to identify that individual uniquely, much like a fingerprint.
The early systems had many flaws, and research ensued to derive a more reliable method
of predicting the correlation between two sets of speech utterances. Speaker
identification research continues today within the field of digital signal
processing, where many advances have taken place in recent years.
In the current design project, a basic speaker identification algorithm has been
written to sort through a list of files and choose the 12 most likely matches based on the
average pitch of the speech utterance as well as the location of the formants in the
frequency domain representation. In addition, experience has been gained in basic
filtering of high-frequency noise with a Butterworth filter, as well as in
speech editing techniques.
II. APPROACH
This multifaceted design project can be divided into six sections:
speech editing, speech degradation, speech enhancement, pitch analysis, formant analysis,
and waveform comparison. The discussion that follows is organized according to these
sections.
SPEECH EDITING
The file recorded with my slower speech (a17.wav) was located in the ordered
list of speakers. A plot of this file is shown in Figure (1). The vector representing
this speech file was found to contain 30,000 samples. The vector was therefore
partitioned into two vectors of equal length, and the two halves were
written to a file in the opposite order. The file was then read and played back. The code for
this process can be found in Appendix A.
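The half-swap described above can be illustrated with a minimal sketch. This is not the Appendix A code: the placeholder vector x stands in for the samples of a17.wav, and the sampling rate fs and the output file name are assumptions that do not appear in the text.

```matlab
% Placeholder for the a17.wav samples: a 30,000-sample column vector.
% Using the sample index as the value makes the swap easy to verify.
x = (1:30000)';

half    = floor(length(x) / 2);   % last index of the first half
first   = x(1:half);              % first half of the utterance
second  = x(half+1:end);          % second half of the utterance
swapped = [second; first];        % halves concatenated in the opposite order

% With real audio, the result could be written and played back, e.g.:
% fs = 8000;                                   % assumed sampling rate in Hz
% audiowrite('a17_swapped.wav', swapped, fs);  % hypothetical file name
% sound(swapped, fs);
```

Because the vector has an even number of samples, both halves have the same length (15,000 samples each), which matches the equal-length partition described in the text.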
SPEECH DEGRADATION
The file recorded with my faster speech (a18.wav) was located in the ordered list
of speakers. Speech degradation was performed by adding Gaussian noise, generated by
the MATLAB function randn(), to this file. A comparison was then made between the
clean signal and the signal with added Gaussian noise. The code for this process
can be found in Appendix B.
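A minimal sketch of this kind of degradation follows. It is not the Appendix B code: a synthetic tone stands in for a18.wav, and the sampling rate fs and the noise standard deviation sigma are assumed values chosen only for illustration.

```matlab
fs    = 8000;                        % assumed sampling rate in Hz
t     = (0:fs-1)' / fs;              % one second of sample times
clean = 0.1 * sin(2*pi*200*t);       % placeholder "speech": a 200 Hz tone

sigma = 0.02;                        % assumed noise standard deviation
noisy = clean + sigma * randn(size(clean));   % add zero-mean Gaussian noise

% Compare the clean and degraded signals through the signal-to-noise ratio.
noise  = noisy - clean;
snr_db = 10 * log10(sum(clean.^2) / sum(noise.^2));

% plot(t, clean, t, noisy);          % visual comparison in the time domain
```

Scaling the output of randn() by sigma sets the noise level, so a larger sigma gives a lower signal-to-noise ratio and a more heavily degraded signal.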
Fig 1. Time domain plot of a17.wav (amplitude versus time in seconds).
SPEECH ENHANCEMENT
The file recorded with my slower speech and background noise (a71.wav) was located
in the ordered list of speakers. A plot of this file is shown in Figure (2).
Fig 2. Time domain plot of a71.wav (amplitude versus time in seconds).
This signal was then converted to the frequency domain through the use of a shifted FFT
and a correctly scaled frequency vector. The higher frequency noise components were
then removed by application of a 3rd order
...
...
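The frequency domain conversion mentioned in the SPEECH ENHANCEMENT section (a shifted FFT with a scaled frequency vector) can be sketched as follows. This is an illustration under stated assumptions, not the author's code: the sampling rate fs is assumed, and a synthetic signal with a low tone plus a weaker high-frequency component stands in for a71.wav.

```matlab
fs = 8000;                           % assumed sampling rate in Hz
N  = 8000;                           % number of samples (even)
t  = (0:N-1)' / fs;                  % sample times in seconds

% Placeholder signal: 300 Hz tone plus a weaker 3000 Hz "noise" component.
x = sin(2*pi*300*t) + 0.5 * sin(2*pi*3000*t);

X = fftshift(fft(x)) / N;            % shifted spectrum, scaled by the length
f = (-N/2 : N/2-1)' * (fs / N);      % frequency axis in Hz (valid for even N)

[~, k]  = max(abs(X));               % strongest spectral component
peak_hz = abs(f(k));                 % its frequency, ignoring the sign

% plot(f, abs(X));                   % two-sided magnitude spectrum
```

fftshift() moves the zero-frequency bin to the center of the vector, so the frequency axis must run from -fs/2 up to just below fs/2 in steps of fs/N for the plotted peaks to line up with their true frequencies. In a spectrum plotted this way, high-frequency noise appears as components far from the center, which is what a low-pass filter such as the Butterworth filter named in the abstract would attenuate.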