Exercise 11.13 - EM for EB estimation of Gaussian shrinkage model

Answers

This is an example of a non-mixture latent graphical model. Here the latent variables are no longer of the one-hot type, which makes the model different from the EM forms we have developed so far.

Recall that the complete likelihood for the Gaussian shrinkage model is:

$$
p(\boldsymbol{\theta}, \mathcal{D} \mid \mu, \tau^2, \{\sigma_j^2\}_{j=1}^D)
= p(\boldsymbol{\theta} \mid \mu, \tau^2)\, p(\mathcal{D} \mid \boldsymbol{\theta}, \{\sigma_j^2\}_{j=1}^D)
= \prod_{j=1}^D \left[ \mathcal{N}(\theta_j \mid \mu, \tau^2) \prod_{i=1}^{N_j} \mathcal{N}(x_{ij} \mid \theta_j, \sigma_j^2) \right].
$$

Taking the logarithm yields:

$$
\begin{aligned}
\log p(\boldsymbol{\theta}, \mathcal{D} \mid \mu, \tau^2, \{\sigma_j^2\}_{j=1}^D)
&= \sum_{j=1}^D \left[ \log \mathcal{N}(\theta_j \mid \mu, \tau^2) + \sum_{i=1}^{N_j} \log \mathcal{N}(x_{ij} \mid \theta_j, \sigma_j^2) \right] \\
&= \sum_{j=1}^D \left[ -\frac{1}{2} \log 2\pi\tau^2 - \frac{1}{2\tau^2} (\theta_j - \mu)^2 \right] + \sum_{j=1}^D \sum_{i=1}^{N_j} \left[ -\frac{1}{2} \log 2\pi\sigma_j^2 - \frac{1}{2\sigma_j^2} (x_{ij} - \theta_j)^2 \right] \\
&= -\frac{D}{2} \log 2\pi\tau^2 - \sum_{j=1}^D \frac{(\theta_j - \mu)^2}{2\tau^2} - \sum_{j=1}^D \frac{N_j}{2} \log 2\pi\sigma_j^2 - \sum_{j=1}^D \sum_{i=1}^{N_j} \frac{(x_{ij} - \theta_j)^2}{2\sigma_j^2}.
\end{aligned}
$$

Note that $p(\boldsymbol{\theta}, \mathcal{D} \mid \mu, \tau^2, \{\sigma_j^2\}_{j=1}^D)$ is jointly Gaussian, so the posterior over $\boldsymbol{\theta}$ can be written down analytically with (4.125) (though this is tedious). Since every $\boldsymbol{\theta}$-dependent term in the complete log-likelihood involves only $\theta_j$ or $\theta_j^2$, each such term can be replaced by the corresponding posterior moment. This completes the E-step.
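Concretely, by conjugacy the posterior factorizes over $j$ and each factor is Gaussian (this is just (4.125) specialized to our notation, with $\bar{x}_j = \frac{1}{N_j}\sum_{i=1}^{N_j} x_{ij}$):

$$
p(\theta_j \mid \mathcal{D}, \mu, \tau^2, \sigma_j^2) = \mathcal{N}(\theta_j \mid \hat{m}_j, \hat{v}_j^2), \qquad
\hat{v}_j^2 = \left( \frac{1}{\tau^2} + \frac{N_j}{\sigma_j^2} \right)^{-1}, \qquad
\hat{m}_j = \hat{v}_j^2 \left( \frac{\mu}{\tau^2} + \frac{N_j \bar{x}_j}{\sigma_j^2} \right),
$$

so the required moments are $\mathbb{E}[\theta_j] = \hat{m}_j$ and $\mathbb{E}[\theta_j^2] = \hat{m}_j^2 + \hat{v}_j^2$.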

For the M-step, this model is no different from the others we have developed so far: taking partial derivatives of the expected complete log-likelihood with respect to $\mu$ and $\tau^2$ and setting them to zero yields the update rules.
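Writing them out in the posterior-moment notation above, the stationarity conditions give

$$
\mu^{\text{new}} = \frac{1}{D} \sum_{j=1}^D \mathbb{E}[\theta_j], \qquad
(\tau^2)^{\text{new}} = \frac{1}{D} \sum_{j=1}^D \left( \mathbb{E}[\theta_j^2] - 2\mu^{\text{new}}\, \mathbb{E}[\theta_j] + (\mu^{\text{new}})^2 \right).
$$

Below is a minimal NumPy sketch of the full EM loop, treating each $\sigma_j^2$ as fixed, consistent with the M-step above, which only updates $\mu$ and $\tau^2$. The function name `em_shrinkage`, the initialization, and the fixed iteration count are my own illustrative choices, not from the text.

```python
import numpy as np

def em_shrinkage(data, sigma2, n_iters=100):
    """EM for empirical Bayes estimation of (mu, tau^2) in the Gaussian
    shrinkage model: theta_j ~ N(mu, tau^2), x_ij ~ N(theta_j, sigma_j^2).

    data   : list of D 1-D arrays; data[j] holds the N_j observations of group j
    sigma2 : length-D array of known observation variances sigma_j^2
    """
    N = np.array([len(x) for x in data], dtype=float)  # group sizes N_j
    xbar = np.array([x.mean() for x in data])          # group means
    mu, tau2 = xbar.mean(), xbar.var() + 1e-6          # crude initialization

    for _ in range(n_iters):
        # E-step: posterior moments of each theta_j (conjugate Gaussian)
        v2 = 1.0 / (1.0 / tau2 + N / sigma2)        # posterior variances v_j^2
        m = v2 * (mu / tau2 + N * xbar / sigma2)    # posterior means m_j
        # M-step: maximize the expected complete log-likelihood
        mu = m.mean()                               # update mu first...
        tau2 = np.mean((m - mu) ** 2 + v2)          # ...then tau^2 with the new mu
    return mu, tau2
```

Note that the $\tau^2$ update reuses the freshly updated $\mu$; this is valid because the stationary point for $\mu$ does not depend on $\tau^2$, and $\mathrm{mean}((m_j - \mu)^2 + v_j^2)$ expands to exactly the $(\tau^2)^{\text{new}}$ formula above.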
