raapjeqp

Answered question

2022-11-01

Continuous Hidden Markov Modeling
I am writing a speaker recognition program in Matlab using Mel Frequency Cepstral Coefficients (MFCCs). I have gotten it to work using dynamic time warping, but I would like to try recognizing speakers with Hidden Markov Models (HMMs). I understand how to implement an HMM for simple examples (like urn and genie), but I don't understand exactly how it applies to my problem. For instance, what are the hidden states that the row vectors are trying to encode? And what does it mean to model these states as Gaussian mixtures? For more information on MFCCs: the extraction gives, for each speaker, a matrix of numbers in which each row corresponds to the power spectrum at one given time.

Answer & Explanation

driogairea1

Beginner, 2022-11-02, added 16 answers

Step 1
An HMM is defined by the following parameters:
1. The initial probabilities $\pi_i$: the probability that the model is in state $i$ at the very beginning of the process ($t = 0$).
2. The transition probability matrix $M = [a_{ij}]$. Each entry $a_{ij}$ is the probability of a transition from state $i$ to state $j$, which implies that every row of $M$ must sum to 1.
3. The states $\{q_i\}$. Every state is characterized by its emission probability $b_i(x) = p(x \mid q = q_i)$, the conditional probability of observing $x$ when the model is in state $q_i$.
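As a rough illustration of the first two parameters, here is a minimal Python sketch (standard library only) of a two-state model. The numbers and the names `pi`, `trans`, `is_valid_hmm`, and `propagate` are hypothetical and chosen only for this example. With $a_{ij}$ read as the probability of moving from state $i$ to state $j$, it is the rows of the matrix that must each sum to 1.

```python
import math

# Hypothetical 2-state model; every number here is made up for illustration.
pi = [0.6, 0.4]        # pi[i] = P(state i at t = 0)
trans = [[0.7, 0.3],   # trans[i][j] = a_ij = P(state j at t+1 | state i at t)
         [0.2, 0.8]]


def is_valid_hmm(pi, trans, tol=1e-9):
    """Return True if pi and every row of trans are probability distributions."""
    n = len(pi)
    if len(trans) != n or any(len(row) != n for row in trans):
        return False
    for row in [pi] + trans:
        if any(p < 0.0 for p in row):
            return False
        if not math.isclose(sum(row), 1.0, abs_tol=tol):
            return False
    return True


def propagate(pi, trans, steps):
    """State distribution after `steps` transitions (row vector pi times trans^steps)."""
    n = len(pi)
    dist = list(pi)
    for _ in range(steps):
        dist = [sum(dist[i] * trans[i][j] for i in range(n)) for j in range(n)]
    return dist
```

Calling `is_valid_hmm(pi, trans)` before training or scoring is a cheap way to catch a transposed transition matrix, which is an easy mistake to make with this convention.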
Step 2
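In your case, every MFCC vector (one time frame of your matrix, which is one row in the layout you describe) is one observation $x \in \mathbb{R}^n$, where $n$ is the length of the coefficient vector. Since the domain is continuous, the emission probability $b_i(x) = p(x \mid q = q_i)$ has to be modeled by a probability density function (pdf). A Gaussian mixture model is widely used as this pdf:
$$p(x) = \sum_{k=1}^{K} c_k \, \frac{1}{\sqrt{(2\pi)^n \, |\Sigma_k|}} \exp\left(-\frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k)\right)$$
where $x, \mu_k \in \mathbb{R}^n$ are column vectors, the $\Sigma_k$ are covariance matrices of size $n \times n$, $K$ is the number of mixture components, and the $c_k \ge 0$ are mixture weights with $\sum_{k=1}^{K} c_k = 1$.
Each state $q_i$ has its own set of weights, means, and covariances.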
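To show how such a mixture density could be used to score an MFCC sequence, here is a minimal Python sketch (standard library only) of a Gaussian mixture emission density combined with the forward algorithm in the log domain. It rests on several assumptions that are not in the answer above: diagonal covariance matrices (a common simplification for MFCC features), explicit mixture weights that sum to 1, and made-up two-dimensional "speaker" models. All function and variable names are hypothetical.

```python
import math


def safe_log(p):
    """log(p) that maps p == 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance diag(var) at x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def log_gmm(x, weights, means, variances):
    """Log of the mixture density sum_k c_k N(x; mu_k, diag(var_k))."""
    return logsumexp([safe_log(c) + log_gaussian_diag(x, mu, var)
                      for c, mu, var in zip(weights, means, variances)])


def forward_loglik(obs, pi, trans, states):
    """log p(obs | model) via the forward algorithm.

    obs    : list of feature vectors, one per time frame
    states : states[i] = (weights, means, variances) of state i's mixture
    """
    n = len(pi)
    alpha = [safe_log(pi[i]) + log_gmm(obs[0], *states[i]) for i in range(n)]
    for x in obs[1:]:
        alpha = [logsumexp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
                 + log_gmm(x, *states[j])
                 for j in range(n)]
    return logsumexp(alpha)


# Two hypothetical speaker models sharing a left-to-right topology.
pi = [1.0, 0.0]
trans = [[0.8, 0.2],
         [0.0, 1.0]]
unit = [[1.0, 1.0], [1.0, 1.0]]  # unit variances for both mixture components
speaker_a = [([0.5, 0.5], [[0.0, 0.0], [0.5, 0.5]], unit),
             ([0.5, 0.5], [[2.0, 2.0], [2.5, 2.5]], unit)]
speaker_b = [([0.5, 0.5], [[5.0, 5.0], [5.5, 5.5]], unit),
             ([0.5, 0.5], [[7.0, 7.0], [7.5, 7.5]], unit)]

# Made-up 2-D "MFCC" frames that lie close to speaker A's means.
frames = [[0.1, -0.2], [0.3, 0.1], [1.9, 2.2], [2.1, 1.8]]
score_a = forward_loglik(frames, pi, trans, speaker_a)
score_b = forward_loglik(frames, pi, trans, speaker_b)
```

In a recognition setting, one such model would be trained per speaker (for example with Baum-Welch, which is not shown here), and an utterance would be assigned to the speaker whose model gives the highest log-likelihood.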
