# homework6_ZhankunLuo
Zhankun Luo
PUID: 0031195279
Fall-2018-ECE-59500-009
Instructor: Toma Hentea
Homework 6
## Chap 7
### Task
Dynamic Time Warping in Speech Recognition
Experiment with the MATLAB script IsoDigitRec.m to match recordings of unknown digit utterances against the set of template patterns (zero.wav, one.wav, …) provided in the textbook's software.
### Small fixes to IsoDigitRec.m
- Change `ind=strfind(curDir,'\');` to `ind=strfind(curDir,'/');` so the path separator matches a Linux/macOS file system.
- `wavread` (as in `[x, Fs, bits] = wavread(...)`) has been removed from current MATLAB releases, so replace those calls with `[x, Fs] = audioread(...)`.
- Change `protoNames={'zero', ...}` to `protoNames={'zero.wav', ...}` accordingly.
### Fixed IsoDigitRec.m
```matlab
% IsoDigitRec.m (Example 5.4)
% "Introduction to Pattern Recognition: A MATLAB Approach"
% S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras

% At a first step, the data folder of Chapter 5 is appended to the existing
% MATLAB path.
curDir=pwd;
ind=strfind(curDir,'/');
curDir(ind(end)+1:end)=[];
addpath([curDir 'data'],'-end');
close('all');
clear;

% To build the system, we will use short-term Energy and short-term Zero-Crossing
% Rate (Section 7.5.4, [Theo 09]) as features, so that each signal is represented
% by a sequence of two-dimensional feature vectors. Note that this is not an
% optimal feature set in any sense and it has only been adopted on the basis of
% simplicity. The feature extraction stage is accomplished by typing the following
% code:
protoNames={'zero.wav','one.wav','two.wav','three.wav','four.wav','five.wav','six.wav','seven.wav','eight.wav','nine.wav'};
for i=1:length(protoNames)
    [x,Fs]=audioread(protoNames{i});
    winlength = round(0.02*Fs); % 20 ms moving window length
    winstep = winlength;        % moving window step. No overlap
    [E,T]=stEnergy(x,Fs,winlength,winstep);
    [Zcr,T]=stZeroCrossingRate(x,Fs,winlength,winstep);
    protoFSeq{i}=[E;Zcr];
end

% To find the best match for an unknown pattern, say a pattern stored in file
% "upattern1.wav", type the following code:
[test,Fs]=audioread('upattern1.wav');
winlength = round(0.02*Fs); % use the same values as before
winstep = winlength;
[E,T]=stEnergy(test,Fs,winlength,winstep);
[Zcr,T]=stZeroCrossingRate(test,Fs,winlength,winstep);
Ftest=[E;Zcr];
tolerance=0.1;
LeftEndConstr=round(tolerance/winstep); % left endpoint constraint
RightEndConstr = LeftEndConstr;
for i=1:length(protoNames)
    [MatchingCost(i),BestPath{i},D{i},Pred{i}]=DTWSakoeEndp(protoFSeq{i},Ftest,LeftEndConstr,RightEndConstr,0);
end
[minCost,indexofBest]=min(MatchingCost);
fprintf('The unknown pattern has been identified as a "%s" \n',protoNames{indexofBest});
```
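The features are produced by the toolbox functions `stEnergy` and `stZeroCrossingRate`. For orientation, the two features can also be computed directly; the snippet below is a minimal sketch that assumes a mono column vector `x` sampled at `Fs` and the same 20 ms non-overlapping frames, and its normalization may differ from the toolbox functions.

```matlab
% Minimal sketch of short-term Energy and Zero-Crossing Rate per frame.
% Assumes a mono column vector x and sampling rate Fs; the exact
% normalization used by stEnergy/stZeroCrossingRate may differ.
winlength = round(0.02*Fs);                  % 20 ms frame
numFrames = floor(length(x)/winlength);      % non-overlapping frames
E   = zeros(1,numFrames);
Zcr = zeros(1,numFrames);
for k = 1:numFrames
    frame  = x((k-1)*winlength+1 : k*winlength);
    E(k)   = sum(frame.^2)/winlength;                        % mean-square energy
    Zcr(k) = sum(abs(diff(sign(frame))) > 0)/(winlength-1);  % fraction of sign changes
end
F = [E; Zcr];   % 2 x numFrames feature sequence, as used by the DTW matcher
```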
### Result for Patterns
To test the other unknown patterns, change `[test,Fs]=audioread('upattern1.wav');` to `[test,Fs]=audioread('upattern02.wav');`, etc., and re-run the script (a loop that automates this is sketched after the table). The results for all patterns are:
Name of Pattern | Identified as |
---|---|
upattern1.wav | zero.wav |
upattern02.wav | zero.wav |
upattern11.wav | zero.wav |
upattern12.wav | one.wav |
upattern13.wav | one.wav |
upattern14.wav | three.wav |
upattern15.wav | zero.wav |
upattern16.wav | four.wav |
upattern17.wav | four.wav |
upattern21.wav | three.wav |
upattern22.wav | two.wav |
upattern23.wav | two.wav |
upattern51.wav | five.wav |
upattern61.wav | six.wav |
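Rather than editing the file name by hand for every test, the matching step can be wrapped in a loop. This is only a convenience sketch: it assumes `protoNames` and `protoFSeq` already exist from the script above, and the list of test file names is taken from the table.

```matlab
% Sketch: classify all unknown patterns in one run.
% Assumes protoNames/protoFSeq were built by IsoDigitRec.m above.
testNames = {'upattern1.wav','upattern02.wav','upattern11.wav','upattern12.wav', ...
             'upattern13.wav','upattern14.wav','upattern15.wav','upattern16.wav', ...
             'upattern17.wav','upattern21.wav','upattern22.wav','upattern23.wav', ...
             'upattern51.wav','upattern61.wav'};
for t = 1:length(testNames)
    [test,Fs] = audioread(testNames{t});
    winlength = round(0.02*Fs);
    winstep   = winlength;
    [E,T]   = stEnergy(test,Fs,winlength,winstep);
    [Zcr,T] = stZeroCrossingRate(test,Fs,winlength,winstep);
    Ftest   = [E;Zcr];
    LeftEndConstr  = round(0.1/winstep);   % same endpoint constraints as before
    RightEndConstr = LeftEndConstr;
    MatchingCost = zeros(1,length(protoNames));
    for i = 1:length(protoNames)
        MatchingCost(i) = DTWSakoeEndp(protoFSeq{i},Ftest,LeftEndConstr,RightEndConstr,0);
    end
    [~,indexofBest] = min(MatchingCost);
    fprintf('%s identified as %s\n', testNames{t}, protoNames{indexofBest});
end
```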
## Chap 8
### Task
HMM recognition and training: run example633.m, example634.m, example635.m, and example636.m.
### Fix
Copy BackTracking.m from the Chapter 5 folder into the Chapter 6 function & example folder, because MultSeqTrainDoHMMVITsc.m calls the function BackTracking.m.
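One way to do this from the MATLAB prompt is sketched below; `chap5dir` and `chap6dir` are hypothetical placeholders for wherever the Chapter 5 and Chapter 6 m-files of the textbook's software sit on your machine.

```matlab
% Sketch: make BackTracking.m visible to the Chapter 6 examples.
% chap5dir/chap6dir are placeholder paths; adjust to your folder layout.
chap5dir = fullfile(pwd,'Chapter5');   % hypothetical Chap 5 m-file folder
chap6dir = fullfile(pwd,'Chapter6');   % hypothetical Chap 6 m-file folder
copyfile(fullfile(chap5dir,'BackTracking.m'), chap6dir);
% Alternatively, leave the file in place and put Chap 5 on the path:
% addpath(chap5dir,'-end');
```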
### Content of functions
```matlab
% CHAPTER 6: m-files
%
% BWDoHMMsc              - Computes the recognition probability of a HMM, given a sequence
%                          of discrete observations, by means of the scaled version of the
%                          Baum-Welch (any-path) method.
% BWDoHMMst              - Same as BWDoHMMsc, except that no scaling is employed.
% MultSeqTrainCoHMMBWsc  - Baum-Welch training (scaled version) of a Continuous Observation
%                          HMM, given multiple training sequences. Each sequence consists
%                          of l-dimensional feature vectors. It is assumed that the pdf
%                          associated with each state is a multivariate Gaussian mixture.
% MultSeqTrainDoHMMBWsc  - Baum-Welch training (scaled version) of a Discrete Observation
%                          HMM, given multiple training sequences.
% MultSeqTrainDoHMMVITsc - Viterbi training (scaled version) of a Discrete Observation HMM,
%                          given multiple training sequences.
% VitCoHMMsc             - Computes the scaled Viterbi score of a HMM, given a sequence of
%                          l-dimensional vectors of continuous observations, under the
%                          assumption that the pdf of each state is a Gaussian mixture.
% VitCoHMMst             - Same as VitCoHMMsc, except that no scaling is employed.
% VitDoHMMsc             - Computes the scaled Viterbi score of a Discrete Observation HMM,
%                          given a sequence of observations.
% VitDoHMMst             - Same as VitDoHMMsc, except that no scaling is employed.
```
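For orientation, the quantity reported by BWDoHMMsc, i.e. the any-path recognition probability of a discrete-observation HMM, can be obtained with a scaled forward recursion. The following is a standalone sketch with made-up parameters, not the toolbox code; it stores B as N x M with B(i,k) = P(symbol k | state i), which may differ from the toolbox's orientation.

```matlab
% Sketch: scaled forward (any-path) log-probability of a discrete-observation HMM.
% pi0 (N x 1) initial probs, A (N x N) transitions with A(i,j) = P(j|i),
% B (N x M) with B(i,k) = P(symbol k | state i), O (1 x T) symbols in 1..M.
% All parameter values below are made up for illustration.
pi0 = [0.7; 0.3];
A   = [0.8 0.2; 0.3 0.7];
B   = [0.6 0.4; 0.2 0.8];
O   = [1 1 2 2 1 2];

T = length(O);
alpha = pi0 .* B(:,O(1));            % forward variables at t = 1
c = 1/sum(alpha);  alpha = alpha*c;  % scale to avoid underflow
logProb = -log(c);
for t = 2:T
    alpha = (A' * alpha) .* B(:,O(t));   % forward recursion
    c = 1/sum(alpha);  alpha = alpha*c;
    logProb = logProb - log(c);          % accumulate log P(O | model)
end
fprintf('log P(O | model) = %.4f\n', logProb);
```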
### Result for example633.m
```
epoch = 1
epoch = 2

piTrained_1 =
    0.7141
    0.2859

ATrained_1 =
    0.6743    0.3257
    0.6746    0.3254

BTrained_1 =
    0.7672    0.3544
    0.2328    0.6456

% press any key

epoch = 1
epoch = 2
epoch = 3
epoch = 4
epoch = 5
epoch = 6
epoch = 7
epoch = 8
epoch = 9
epoch = 10
epoch = 11
epoch = 12
epoch = 13

piTrained_2 =
     1
     0

ATrained_2 =
    1.0000    0.0000
         0    1.0000

BTrained_2 =
    0.6333         0
    0.3667    1.0000
```
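The piTrained/ATrained/BTrained values above are the parameters re-estimated by the scaled Baum-Welch procedure over the reported epochs. As a reference for what one re-estimation pass does, here is a standalone sketch for a single discrete observation sequence with made-up parameters; it omits scaling and multiple-sequence handling, so it is not the toolbox implementation, and B follows my own B(i,k) = P(symbol k | state i) convention.

```matlab
% Sketch: one Baum-Welch re-estimation pass for a discrete-observation HMM.
% Single sequence, no scaling (acceptable for short sequences). Values made up.
% Conventions: A(i,j) = P(state j | state i), B(i,k) = P(symbol k | state i).
pi0 = [0.7; 0.3];
A   = [0.8 0.2; 0.3 0.7];
B   = [0.6 0.4; 0.2 0.8];
O   = [1 1 2 2 1 2];
N   = length(pi0);  T = length(O);

% Forward and backward variables
alpha = zeros(N,T);  beta = zeros(N,T);
alpha(:,1) = pi0 .* B(:,O(1));
for t = 2:T, alpha(:,t) = (A' * alpha(:,t-1)) .* B(:,O(t)); end
beta(:,T) = 1;
for t = T-1:-1:1, beta(:,t) = A * (B(:,O(t+1)) .* beta(:,t+1)); end

% State and transition posteriors
PO    = sum(alpha(:,T));                      % P(O | model)
gamma = (alpha .* beta) / PO;                 % gamma(i,t) = P(q_t = i | O)
xi    = zeros(N,N,T-1);
for t = 1:T-1
    xi(:,:,t) = (alpha(:,t) * (B(:,O(t+1)) .* beta(:,t+1))') .* A / PO;
end

% Re-estimated parameters (one epoch)
piNew = gamma(:,1);
ANew  = diag(1 ./ sum(gamma(:,1:T-1),2)) * sum(xi,3);
BNew  = zeros(size(B));
for k = 1:size(B,2)
    BNew(:,k) = sum(gamma(:,O==k),2) ./ sum(gamma,2);
end
disp(piNew), disp(ANew), disp(BNew)
```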
### Result for example634.m
```
theEpoch = 1
theEpoch = 2

piTrained_1 =
    0.6857
    0.3143

ATrained_1 =
    0.6278    0.3722
    0.6288    0.3712

BTrained_1 =
     1     0
     0     1
```
### Result for example635.m
```
epoch = 1
epoch = 2

piTrained_1 =
    0.7141
    0.2859

ATrained_1 =
    0.6743    0.3257
    0.6746    0.3254

BTrained_1 =
    0.7672    0.3544
    0.2328    0.6456

% press any key

epoch = 1
epoch = 2
epoch = 3
epoch = 4
epoch = 5
epoch = 6
epoch = 7
epoch = 8
epoch = 9
epoch = 10
epoch = 11
epoch = 12
epoch = 13

piTrained_2 =
     1
     0

ATrained_2 =
    1.0000    0.0000
         0    1.0000

BTrained_2 =
    0.6333         0
    0.3667    1.0000
```
### Result for example636.m
```
Pr1 = -8.8513
Pr2 = -15.1390

bs1 =
     1     1     1     1     1     1     1     2     2     2     2     2     2

bs2 =
     1     2     2     2     2     2     2     2     2     2     2     2     2
```
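Pr1/Pr2 are the Viterbi scores and bs1/bs2 the corresponding best-state sequences reported by the example. As a reference, log-domain Viterbi decoding for a discrete-observation HMM can be written as below; this is a standalone sketch with made-up parameters, not the toolbox's VitDoHMMsc, and B again follows my B(i,k) = P(symbol k | state i) convention.

```matlab
% Sketch: log-domain Viterbi decoding for a discrete-observation HMM.
% Conventions: A(i,j) = P(state j | state i), B(i,k) = P(symbol k | state i).
% All parameter values are made up for illustration.
pi0 = [0.7; 0.3];
A   = [0.8 0.2; 0.3 0.7];
B   = [0.6 0.4; 0.2 0.8];
O   = [1 1 2 2 1 2];

N = length(pi0);  T = length(O);
delta = zeros(N,T);                    % best log-score ending in state i at time t
psi   = zeros(N,T);                    % back-pointers
delta(:,1) = log(pi0) + log(B(:,O(1)));
for t = 2:T
    for j = 1:N
        [best, psi(j,t)] = max(delta(:,t-1) + log(A(:,j)));
        delta(j,t) = best + log(B(j,O(t)));
    end
end
[score, q] = max(delta(:,T));          % best final log-score and final state
bestPath = zeros(1,T);  bestPath(T) = q;
for t = T-1:-1:1                       % backtrack through the stored pointers
    bestPath(t) = psi(bestPath(t+1), t+1);
end
fprintf('Viterbi log-score = %.4f\n', score);
disp(bestPath)
```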