
Exercise 24.4 - Full conditionals for hierarchical model of Gaussian means

Answers

Recall that the likelihood for the Gaussian-Gaussian model is:

$$p(\theta, \mathcal{D} \mid \mu, \tau^2, \sigma^2) = \left( \prod_{j=1}^{D} \mathcal{N}(\theta_j \mid \mu, \tau^2) \right) \left( \prod_{j=1}^{D} \prod_{i=1}^{N_j} \mathcal{N}(x_{ij} \mid \theta_j, \sigma^2) \right).$$

We proceed from the conditional distribution of $\mu$:

$$p(\mu \mid \theta, \mathcal{D}, \tau^2, \sigma^2) = \frac{p(\mu, \theta, \mathcal{D}, \tau^2, \sigma^2)}{p(\theta, \mathcal{D}, \tau^2, \sigma^2)} \propto p(\mu)\, p(\theta \mid \mu, \tau^2).$$

Taking the logarithm yields:

$$\log p(\mu \mid \theta, \mathcal{D}, \tau^2, \sigma^2) = -\frac{(\mu - \mu_0)^2}{2\gamma_0^2} - \sum_{j=1}^{D} \frac{(\mu - \theta_j)^2}{2\tau^2} + \mathrm{const}.$$

Hence the conditional variance and mean are:

$$\gamma_c^2 = \left( \frac{1}{\gamma_0^2} + \frac{D}{\tau^2} \right)^{-1},$$

$$\mu_c = \gamma_c^2 \left( \frac{\mu_0}{\gamma_0^2} + \frac{1}{\tau^2} \sum_{j=1}^{D} \theta_j \right),$$

which agrees with (24.115).
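As a sanity check, this update is straightforward to code. Below is a minimal NumPy sketch of the draw for $\mu$; the function and argument names (`sample_mu`, `gamma0_sq`, etc.) are mine, not the book's, and the group means are assumed to be stored in a 1-D array `theta`.

```python
import numpy as np

def sample_mu(theta, mu0, gamma0_sq, tau_sq, rng):
    """Draw mu from its full conditional N(mu_c, gamma_c^2)."""
    D = len(theta)
    gamma_c_sq = 1.0 / (1.0 / gamma0_sq + D / tau_sq)                # conditional variance
    mu_c = gamma_c_sq * (mu0 / gamma0_sq + np.sum(theta) / tau_sq)   # conditional mean
    return rng.normal(mu_c, np.sqrt(gamma_c_sq))
```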

For the centroids:

$$p(\theta_k \mid \mathcal{D}, \theta_{-k}, \mu, \tau^2, \sigma^2) = \frac{p(\theta_k, \mathcal{D}, \theta_{-k}, \mu, \tau^2, \sigma^2)}{p(\mathcal{D}, \theta_{-k}, \mu, \tau^2, \sigma^2)} \propto p(\theta_k \mid \mu, \tau^2)\, p(\mathcal{D} \mid \theta_k, \theta_{-k}, \sigma^2) \propto \exp\left\{ -\frac{(\theta_k - \mu)^2}{2\tau^2} \right\} \exp\left\{ -\sum_{i=1}^{N_k} \frac{(\theta_k - x_{ik})^2}{2\sigma^2} \right\}.$$

We end up with a Gaussian with variance:

$$\tau_k^2 = \left( \frac{1}{\tau^2} + \frac{N_k}{\sigma^2} \right)^{-1},$$

and mean:

$$\mu_k = \tau_k^2 \left( \frac{\mu}{\tau^2} + \frac{1}{\sigma^2} \sum_{i=1}^{N_k} x_{ik} \right).$$
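Continuing the same hypothetical sketch, each $\theta_k$ can be drawn independently given the rest; here the data are assumed to be stored as a list `x` of 1-D arrays, one array per group.

```python
import numpy as np

def sample_theta(x, mu, tau_sq, sigma_sq, rng):
    """Draw each theta_k from N(mu_k, tau_k^2); x is a list of 1-D arrays, one per group."""
    theta = np.empty(len(x))
    for k, x_k in enumerate(x):
        N_k = len(x_k)
        tau_k_sq = 1.0 / (1.0 / tau_sq + N_k / sigma_sq)             # conditional variance
        mu_k = tau_k_sq * (mu / tau_sq + np.sum(x_k) / sigma_sq)     # conditional mean
        theta[k] = rng.normal(mu_k, np.sqrt(tau_k_sq))
    return theta
```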

For the variance on the centroids:

$$p(\tau^2 \mid \theta, \mathcal{D}, \mu, \sigma^2) = \frac{p(\tau^2, \theta, \mathcal{D}, \mu, \sigma^2)}{p(\theta, \mathcal{D}, \mu, \sigma^2)} \propto p(\tau^2) \prod_{j=1}^{D} p(\theta_j \mid \mu, \tau^2) \propto (\tau^2)^{-\frac{\eta_0}{2} - 1} \exp\left\{ -\frac{\eta_0 \tau_0^2}{2\tau^2} \right\} (\tau^2)^{-\frac{D}{2}} \exp\left\{ -\sum_{j=1}^{D} \frac{(\theta_j - \mu)^2}{2\tau^2} \right\}.$$

Therefore the parameters of the conditional IG distribution are:

$$\eta_c = \eta_0 + D,$$

$$\tau_c^2 = \frac{\eta_0 \tau_0^2 + \sum_{j=1}^{D} (\theta_j - \mu)^2}{\eta_c}.$$
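In code, this draw reduces to sampling an inverse-Gamma variate, which NumPy supports indirectly: if $X \sim \mathrm{Ga}(a, \mathrm{rate}=b)$ then $1/X \sim \mathrm{IG}(a, b)$. A sketch under the same assumed names:

```python
import numpy as np

def sample_tau_sq(theta, mu, eta0, tau0_sq, rng):
    """Draw tau^2 from IG(eta_c / 2, eta_c * tau_c^2 / 2)."""
    D = len(theta)
    eta_c = eta0 + D
    tau_c_sq = (eta0 * tau0_sq + np.sum((theta - mu) ** 2)) / eta_c
    # NumPy's gamma uses a scale parameter, so scale = 1 / rate = 2 / (eta_c * tau_c_sq).
    return 1.0 / rng.gamma(eta_c / 2.0, 2.0 / (eta_c * tau_c_sq))
```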

Finally, we handle the variance on the data:

$$p(\sigma^2 \mid \theta, \mathcal{D}, \mu, \tau^2) \propto p(\sigma^2)\, p(\mathcal{D} \mid \theta, \sigma^2) \propto (\sigma^2)^{-\frac{\nu_0}{2} - 1} \exp\left\{ -\frac{\nu_0 \sigma_0^2}{2\sigma^2} \right\} (\sigma^2)^{-\frac{N}{2}} \exp\left\{ -\sum_{j=1}^{D} \sum_{i=1}^{N_j} \frac{(\theta_j - x_{ij})^2}{2\sigma^2} \right\},$$

where $N = \sum_{j=1}^{D} N_j$. This matches (24.118).
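Putting the four conditionals together gives one Gibbs sweep. The sketch below reuses the hypothetical helpers defined above (`sample_theta`, `sample_mu`, `sample_tau_sq`) and adds the analogous draw for $\sigma^2$; the `hyper` dictionary holding the hyperparameters is my own convention, not the book's.

```python
import numpy as np

def sample_sigma_sq(x, theta, nu0, sigma0_sq, rng):
    """Draw sigma^2 from IG(nu_c / 2, nu_c * sigma_c^2 / 2), with nu_c = nu0 + N."""
    N = sum(len(x_k) for x_k in x)
    ss = sum(np.sum((x_k - theta[k]) ** 2) for k, x_k in enumerate(x))
    nu_c = nu0 + N
    sigma_c_sq = (nu0 * sigma0_sq + ss) / nu_c
    return 1.0 / rng.gamma(nu_c / 2.0, 2.0 / (nu_c * sigma_c_sq))

def gibbs_sweep(x, theta, mu, tau_sq, sigma_sq, hyper, rng):
    """One full scan over (theta, mu, tau^2, sigma^2) using the four full conditionals."""
    theta = sample_theta(x, mu, tau_sq, sigma_sq, rng)
    mu = sample_mu(theta, hyper["mu0"], hyper["gamma0_sq"], tau_sq, rng)
    tau_sq = sample_tau_sq(theta, mu, hyper["eta0"], hyper["tau0_sq"], rng)
    sigma_sq = sample_sigma_sq(x, theta, hyper["nu0"], hyper["sigma0_sq"], rng)
    return theta, mu, tau_sq, sigma_sq
```

Repeatedly calling `gibbs_sweep` from some initial state produces, after burn-in, draws from the joint posterior over $(\theta, \mu, \tau^2, \sigma^2)$.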
