Exercise 17.3 - EM for HMMs with mixture of Gaussian observations
Answers
The complete likelihood takes the form:
$$p(\mathbf{x}_{1:T}, \mathbf{z}_{1:T} \mid \boldsymbol{\theta}) = \pi_{z_1} \prod_{t=2}^{T} A_{z_{t-1}, z_t} \prod_{t=1}^{T} \sum_{k=1}^{C} W_{z_t k}\, \mathcal{N}(\mathbf{x}_t \mid \boldsymbol{\mu}_{z_t k}, \boldsymbol{\Sigma}_{z_t k}),$$
where $\boldsymbol{\pi}$ is the initial distribution of the hidden state, $\mathbf{A}$ is a $K \times K$ matrix of transition probabilities $A_{ij} = p(z_t = j \mid z_{t-1} = i)$, $\mathbf{W}$ is a $K \times C$ matrix of mixing weights $W_{jk}$ for the $C$ Gaussian components of each state, and $\boldsymbol{\mu}$, $\boldsymbol{\Sigma}$ are tensors whose $(j,k)$-th components denote $\boldsymbol{\mu}_{jk}$ and $\boldsymbol{\Sigma}_{jk}$ respectively. Now its logarithm reads (we temporarily drop the conditioning on the parameters for conciseness):
$$\log p(\mathbf{x}_{1:T}, \mathbf{z}_{1:T}) = \log \pi_{z_1} + \sum_{t=2}^{T} \log A_{z_{t-1}, z_t} + \sum_{t=1}^{T} \log \sum_{k=1}^{C} W_{z_t k}\, \mathcal{N}(\mathbf{x}_t \mid \boldsymbol{\mu}_{z_t k}, \boldsymbol{\Sigma}_{z_t k}).$$
When the expectation is taken w.r.t. $p(\mathbf{z}_{1:T} \mid \mathbf{x}_{1:T}, \boldsymbol{\theta}^{\text{old}})$, the only expectations that matter are identical to those in Exercise 17.1: the smoothed single-slice marginals $p(z_t = j \mid \mathbf{x}_{1:T})$ and two-slice marginals $p(z_{t-1} = i, z_t = j \mid \mathbf{x}_{1:T})$, which are computed with the forwards-backwards algorithm. This finishes the E-step for this model.
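As a concrete illustration of this E-step, here is a minimal NumPy/SciPy sketch (not from the book; the names `emission_probs`, `smoothed_marginals`, `pi`, `A`, `W`, `mus`, `Sigmas` are illustrative assumptions): it evaluates the per-state mixture emission densities and runs a scaled forwards-backwards pass to obtain the smoothed marginals $p(z_t = j \mid \mathbf{x}_{1:T})$, denoted $\gamma_t(j)$ below; the two-slice marginals needed for the $\mathbf{A}$ update follow from the same `alpha`, `beta`, and `B` arrays.

```python
import numpy as np
from scipy.stats import multivariate_normal


def emission_probs(X, W, mus, Sigmas):
    """B[t, j] = p(x_t | z_t = j) = sum_k W[j, k] * N(x_t | mus[j, k], Sigmas[j, k]).

    Shapes: X (T, D), W (K, C), mus (K, C, D), Sigmas (K, C, D, D).
    """
    T = X.shape[0]
    K, C = W.shape
    B = np.zeros((T, K))
    for j in range(K):
        for k in range(C):
            B[:, j] += W[j, k] * multivariate_normal.pdf(X, mus[j, k], Sigmas[j, k])
    return B


def smoothed_marginals(pi, A, B):
    """Scaled forwards-backwards pass; returns gamma[t, j] = p(z_t = j | x_{1:T})."""
    T, K = B.shape
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))
    c = np.zeros(T)                        # per-step normalisers
    alpha[0] = pi * B[0]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):                  # forwards recursion
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):         # backwards recursion
        beta[t] = A @ (B[t + 1] * beta[t + 1]) / c[t + 1]
    return alpha * beta                    # each row already sums to one
```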
For the M-step, the updates for $\boldsymbol{\pi}$ and $\mathbf{A}$ are identical to those for an ordinary HMM, since the corresponding terms of the auxiliary function remain the same. To update $\mathbf{W}$, $\boldsymbol{\mu}$, and $\boldsymbol{\Sigma}$, note that the auxiliary function depends on them only through:
$$\sum_{t=1}^{T} \sum_{j=1}^{K} p(z_t = j \mid \mathbf{x}_{1:T}, \boldsymbol{\theta}^{\text{old}}) \log \sum_{k=1}^{C} W_{jk}\, \mathcal{N}(\mathbf{x}_t \mid \boldsymbol{\mu}_{jk}, \boldsymbol{\Sigma}_{jk}).$$
This is tantamount to estimating the parameters of each state's Gaussian components independently, with an extra weight on each sample. Let us denote
$$\gamma_t(j) \equiv p(z_t = j \mid \mathbf{x}_{1:T}, \boldsymbol{\theta}^{\text{old}}),$$
so $\gamma_t(j)$ is determined from the old set of parameters. Then consider the weighted likelihood, in which we use a one-hot vector $\mathbf{m}_{tj} \in \{0,1\}^{C}$ of length $C$ to embed the new latent variables (the mixture-component indicators):
$$\tilde{p}(\mathbf{x}_{1:T}, \mathbf{m}) = \prod_{t=1}^{T} \prod_{j=1}^{K} \left( \prod_{k=1}^{C} \left[ W_{jk}\, \mathcal{N}(\mathbf{x}_t \mid \boldsymbol{\mu}_{jk}, \boldsymbol{\Sigma}_{jk}) \right]^{m_{tjk}} \right)^{\gamma_t(j)}.$$
Although tedious, one can consider the pair $(t, j)$ as a composite index for the data, with $\gamma_t(j)$ acting as a fractional replication count of pseudo-sample $(t, j)$, and the pair $(j, k)$ as a composite index for the Gaussian components. Now the internal auxiliary function reads (the term involving only the posterior of the latent variables does not depend on the new parameters and is thus omitted):
$$\tilde{Q} = \sum_{t=1}^{T} \sum_{j=1}^{K} \gamma_t(j) \sum_{k=1}^{C} r_{tjk} \left[ \log W_{jk} + \log \mathcal{N}(\mathbf{x}_t \mid \boldsymbol{\mu}_{jk}, \boldsymbol{\Sigma}_{jk}) \right],$$
where:
$$r_{tjk} \equiv p(m_{tjk} = 1 \mid \mathbf{x}_t, \boldsymbol{\theta}^{\text{old}}) = \frac{W^{\text{old}}_{jk}\, \mathcal{N}(\mathbf{x}_t \mid \boldsymbol{\mu}^{\text{old}}_{jk}, \boldsymbol{\Sigma}^{\text{old}}_{jk})}{\sum_{k'=1}^{C} W^{\text{old}}_{jk'}\, \mathcal{N}(\mathbf{x}_t \mid \boldsymbol{\mu}^{\text{old}}_{jk'}, \boldsymbol{\Sigma}^{\text{old}}_{jk'})},$$
in which $r_{tjk}$ can be computed as in an ordinary GMM by focusing on the Gaussian components under the hidden state $j$. This concludes the E-step for the second auxiliary function. The M-step for $W_{jk}$, $\boldsymbol{\mu}_{jk}$, and $\boldsymbol{\Sigma}_{jk}$ is thus similar to that for an ordinary GMM, except for the introduction of the extra factor $\gamma_t(j)$, so the effective responsibility of component $(j,k)$ for sample $\mathbf{x}_t$ is $\gamma_t(j)\, r_{tjk}$:
$$W_{jk} = \frac{\sum_{t} \gamma_t(j)\, r_{tjk}}{\sum_{t} \gamma_t(j)}, \qquad \boldsymbol{\mu}_{jk} = \frac{\sum_{t} \gamma_t(j)\, r_{tjk}\, \mathbf{x}_t}{\sum_{t} \gamma_t(j)\, r_{tjk}}, \qquad \boldsymbol{\Sigma}_{jk} = \frac{\sum_{t} \gamma_t(j)\, r_{tjk}\, (\mathbf{x}_t - \boldsymbol{\mu}_{jk})(\mathbf{x}_t - \boldsymbol{\mu}_{jk})^{\top}}{\sum_{t} \gamma_t(j)\, r_{tjk}}.$$
This also completes the remainder of the M-step for the first auxiliary function.
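To make this inner step concrete, here is a minimal NumPy/SciPy sketch of one inner EM iteration for the mixture attached to a single hidden state $j$ (an illustration under the notation above; the function name `update_state_gmm`, its arguments, and the `reg` stabiliser are assumptions, not the book's method): the responsibilities are the ordinary GMM ones, and the smoothing weight $\gamma_t(j)$ enters only as a per-sample multiplier in the weighted sums.

```python
import numpy as np
from scipy.stats import multivariate_normal


def update_state_gmm(X, gamma_j, W_j, mus_j, Sigmas_j, reg=1e-6):
    """One inner EM iteration for the mixture attached to hidden state j.

    X        : (T, D)    observations
    gamma_j  : (T,)      smoothed marginals gamma_t(j) from the outer E-step
    W_j      : (C,)      mixing weights of state j
    mus_j    : (C, D)    component means
    Sigmas_j : (C, D, D) component covariances
    """
    T, D = X.shape
    C = W_j.shape[0]

    # Inner E-step: ordinary GMM responsibilities r_{tjk} under state j.
    r = np.zeros((T, C))
    for k in range(C):
        r[:, k] = W_j[k] * multivariate_normal.pdf(X, mus_j[k], Sigmas_j[k])
    r /= r.sum(axis=1, keepdims=True)

    # Inner M-step: weighted-GMM updates with the extra factor gamma_t(j).
    w = gamma_j[:, None] * r               # effective weights gamma_t(j) * r_{tjk}
    Nk = w.sum(axis=0)                     # (C,) weighted counts per component
    W_new = Nk / gamma_j.sum()
    mus_new = (w.T @ X) / Nk[:, None]
    Sigmas_new = np.zeros((C, D, D))
    for k in range(C):
        diff = X - mus_new[k]
        Sigmas_new[k] = (w[:, k, None] * diff).T @ diff / Nk[k]
        Sigmas_new[k] += reg * np.eye(D)   # small ridge for numerical stability
    return W_new, mus_new, Sigmas_new
```

Running this function once per hidden state (or iterating it a few times per outer EM step) implements the weighted-GMM updates displayed above.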