Automatic Speaker Identification Using Vector Quantization
Shail Bala Jain
An automatic speaker identification scheme is proposed and developed to identify or verify a person from his or her voice, using a novel method. Every speaker identification system contains two main phases, a training phase and a testing phase. In the training phase, features of the words spoken by different speakers are extracted; in the testing phase, feature matching takes place. The feature extractor transforms the raw speech signal into a compact but effective representation that is more stable and discriminative than the original signal. The features, or template, thus extracted are stored in a database. During the recognition phase, the features extracted from the test speech are compared with the templates in the database. In the proposed Speaker Identifier (SI), the features extracted are Linear Prediction Cepstral Coefficients (LPCC), Mel-Frequency Cepstral Coefficients (MFCC), Delta MFCC (DMFCC) and Delta-Delta MFCC (DDMFCC). Vector Quantization (VQ) is used for speaker modeling. The final recognition decision is based on the matching score: the speaker model with the smallest matching score is selected as the speaker of the test speech sample. The speaker identification rate was observed to be 96.59% in the text-independent case and increased by 3.5% in the text-dependent case. Increasing the feature vector size to 36 by including 12 DMFCC and 12 DDMFCC coefficients raised the recognition rate by 0.4%. Better performance may be obtained by applying this approach, alone or combined with a Hidden Markov Model (HMM), to isolated-word speech recognition.
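The VQ matching rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook is trained with plain k-means as a stand-in for whatever VQ training the paper uses, the matching score is assumed to be the average squared Euclidean distortion, the delta features use a simple finite difference, and the 12-dimensional "MFCC" frames are synthetic random data. All function names (`add_deltas`, `train_codebook`, `distortion`, `identify`) and parameters are hypothetical.

```python
import numpy as np

def add_deltas(static):
    """Stack static, delta and delta-delta features: (n, 12) -> (n, 36).
    Assumption: deltas are approximated by a finite difference over time."""
    delta = np.gradient(static, axis=0)
    delta_delta = np.gradient(delta, axis=0)
    return np.hstack([static, delta, delta_delta])

def train_codebook(features, size=4, iters=20, seed=0):
    """Train a VQ codebook with plain k-means (assumed training method)."""
    rng = np.random.default_rng(seed)
    # Initialise codewords from randomly chosen training frames.
    codebook = features[rng.choice(len(features), size, replace=False)].copy()
    for _ in range(iters):
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        nearest = dists.argmin(axis=1)
        for k in range(size):
            members = features[nearest == k]
            if len(members):  # leave an empty cell's codeword unchanged
                codebook[k] = members.mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average squared distance from each frame to its nearest codeword."""
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.min(axis=1).mean()

def identify(features, codebooks):
    """Return the speaker whose codebook gives the smallest matching score."""
    scores = {name: distortion(features, cb) for name, cb in codebooks.items()}
    return min(scores, key=scores.get), scores

# Synthetic stand-ins for per-speaker MFCC frames (not real speech data).
rng = np.random.default_rng(1)
train = {
    "speaker_a": rng.normal(0.0, 1.0, size=(200, 12)),
    "speaker_b": rng.normal(5.0, 1.0, size=(200, 12)),
}
codebooks = {name: train_codebook(add_deltas(f)) for name, f in train.items()}

# Test utterance drawn from the same distribution as speaker_b.
test_utterance = rng.normal(5.0, 1.0, size=(50, 12))
best, scores = identify(add_deltas(test_utterance), codebooks)
print(best, scores)
```

In this sketch each speaker is represented only by a small codebook, so identification reduces to one distortion computation per enrolled speaker followed by taking the minimum.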