SP Module 7 Pattern Matching

2022-11-24 17:13:09

Cochlea, Mel-Scale, Filterbanks

From human speech perception to considerations for features for automatic speech recognition

Cochlea: Different places along the cochlea respond to different frequencies in the incoming sound.

Mel scale: The cochlea's response to frequency is nonlinear on the hertz scale but approximately linear on the mel scale.
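The hertz-to-mel relationship can be sketched in a few lines. The formula below is one commonly used approximation and is an assumption here, since these notes do not specify one; the function names `hz_to_mel` and `mel_to_hz` are illustrative.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mels using one common approximation (assumed, not from the notes)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Doubling the frequency in Hz adds less and less on the mel scale.
for f in (500, 1000, 2000, 4000, 8000):
    print(f, "Hz ->", round(hz_to_mel(f), 1), "mel")
```

With this approximation, 1000 Hz maps to roughly 1000 mel, and the steps between successive octaves shrink as frequency rises, which is the nonlinearity described above.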

Filter banks: A simplified model of the cochlea is a bank of bandpass filters, each defined by a lower and an upper frequency limit.

The filters get wider and wider at higher frequencies. Triangular filters are more appropriate than the rectangular ones shown in the figure above.
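A minimal sketch of such a filter bank follows, assuming the filter limits are spaced equally on the mel scale (using the same assumed mel approximation as is common in practice) and that each filter's limits are its neighbours' centres. The helper names `filter_edges` and `triangular_weight`, the choice of 5 filters, and the 0 to 8000 Hz range are all hypothetical.

```python
import math

def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)  # assumed approximation

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filter_edges(n_filters, f_low, f_high):
    """Return n_filters + 2 frequencies in Hz, equally spaced on the mel scale."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]

def triangular_weight(f, lower, centre, upper):
    """Triangular filter gain at frequency f: 0 at both limits, 1 at the centre."""
    if f <= lower or f >= upper:
        return 0.0
    if f <= centre:
        return (f - lower) / (centre - lower)
    return (upper - f) / (upper - centre)

n_filters = 5
edges = filter_edges(n_filters, f_low=0.0, f_high=8000.0)
# Filter i spans edges[i] (lower limit) to edges[i + 2] (upper limit).
widths = [edges[i + 2] - edges[i] for i in range(n_filters)]
print([round(w) for w in widths])  # bandwidths in Hz grow with frequency
```

Because the spacing is uniform in mels, the bandwidths in hertz increase from one filter to the next, matching the "wider and wider" behaviour noted above.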

Feature vectors, sequences, and sequences of feature vectors

Representing speech as a sequence of feature vectors

Features: Raw waveform samples are not useful as features. The magnitude spectrum (DFT) is better, and the spectral envelope is better still. To capture the spectral envelope, we use a filter bank. The feature vector stores the filter bank outputs, which encode the spectral envelope.
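The path from one frame of waveform samples to one feature vector can be sketched as follows. This is an illustration under stated assumptions, not the course's exact recipe: the DFT is computed naively, the filters are given by hand as hypothetical (lower, centre, upper) bin indices, and taking the log of each filter output is an added assumption.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitude for bins 0..N/2 (fine for a short illustrative frame)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def filterbank_features(spectrum, filters):
    """One log output per triangular filter; filters are (lower, centre, upper) bins."""
    feats = []
    for lo, c, hi in filters:
        energy = 0.0
        for k in range(lo, hi + 1):
            w = (k - lo) / (c - lo) if k <= c else (hi - k) / (hi - c)
            energy += w * spectrum[k]
        feats.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return feats

# Hypothetical 32-sample frame: a sinusoid sitting exactly on DFT bin 4.
frame = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
filters = [(0, 2, 4), (2, 4, 8), (4, 8, 16)]  # wider at higher frequencies
features = filterbank_features(magnitude_spectrum(frame), filters)
print(features)
```

Only the second filter is centred on bin 4, so its output dominates the feature vector. Repeating this for every frame of an utterance gives the sequence of feature vectors.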

Sequences are everywhere in language

sequence of feature vectors

Exemplars and Distances

We start to look at the concepts of distance and alignment between sequences of speech data

Exemplar: a stored sequence of feature vectors for a word. Distance (dissimilarity) is measured between two sequences of feature vectors. Creating an alignment between the exemplar and the unknown is the first step in calculating the global distance.
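To make the relationship between alignment and global distance concrete, here is a small sketch. It assumes Euclidean distance as the local distance between two feature vectors and takes the global distance to be the sum of local distances over the aligned frame pairs; both choices, the function names, and the toy data are illustrative assumptions.

```python
import math

def local_distance(u, v):
    """Euclidean distance between two feature vectors (an assumed choice of metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def global_distance(exemplar, unknown, alignment):
    """Sum local distances over aligned pairs (i = exemplar frame, j = unknown frame)."""
    return sum(local_distance(exemplar[i], unknown[j]) for i, j in alignment)

# Toy data: the unknown is the exemplar with its first frame repeated.
exemplar = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
unknown = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]

# A hand-written alignment that maps both repeated frames to exemplar frame 0.
alignment = [(0, 0), (0, 1), (1, 2), (2, 3)]
print(global_distance(exemplar, unknown, alignment))
```

With this alignment every aligned pair is identical, so the global distance is zero even though the two sequences differ in length. A poor alignment of the same two sequences would give a larger value, which is why finding the best alignment matters.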

Pattern Matching, Alignment, Dynamic Time Warping

Search a grid with Dynamic Time Warping

Dynamic programming:

Dynamic time warping, pattern matching, aligning frames
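A minimal sketch of dynamic time warping as a dynamic programming search over the grid of (exemplar frame, unknown frame) pairs is given below. It assumes Euclidean local distance, unweighted moves, and three allowed predecessors per cell (diagonal, previous exemplar frame, previous unknown frame); other variants use different move sets or weights. The function name `dtw` and the toy data are hypothetical.

```python
import math

def local_distance(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw(exemplar, unknown):
    """Return (global distance, alignment path) for the cheapest path through the grid."""
    n, m = len(exemplar), len(unknown)
    cost = [[float("inf")] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = local_distance(exemplar[i], unknown[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            # Best partial path ending in any allowed predecessor cell.
            candidates = []
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))
            best, prev = min(candidates)
            cost[i][j] = d + best
            back[i][j] = prev
    # Trace back from the final cell to recover the alignment.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        path.append(back[path[-1][0]][path[-1][1]])
    path.reverse()
    return cost[n - 1][m - 1], path

exemplar = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
unknown = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
distance, path = dtw(exemplar, unknown)
print(distance, path)
```

Each cell stores only the cost of the best partial path reaching it, so the search visits every grid cell once instead of enumerating every possible alignment. To recognise a word, one would compute this distance against each stored exemplar and pick the exemplar with the smallest global distance.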

More Dynamic Time Warping
