
Exercise 24.4 - Full conditionals for hierarchical model of Gaussian means

Answers

Recall that the likelihood for the Gaussian-Gaussian model is:

$$p(\theta, \mathcal{D} \mid \mu, \tau^2, \sigma^2) = \left( \prod_{j=1}^{D} \mathcal{N}(\theta_j \mid \mu, \tau^2) \right) \left( \prod_{j=1}^{D} \prod_{i=1}^{N_j} \mathcal{N}(x_{ij} \mid \theta_j, \sigma^2) \right).$$

We proceed from the conditional distribution of $\mu$:

$$p(\mu \mid \theta, \mathcal{D}, \tau^2, \sigma^2) = \frac{p(\mu, \theta, \mathcal{D}, \tau^2, \sigma^2)}{p(\theta, \mathcal{D}, \tau^2, \sigma^2)} \propto p(\mu)\, p(\theta \mid \mu, \tau^2).$$

Taking the logarithm yields:

$$\log p(\mu \mid \theta, \mathcal{D}, \tau^2, \sigma^2) = -\frac{(\mu - \mu_0)^2}{2\gamma_0^2} - \sum_{j=1}^{D} \frac{(\mu - \theta_j)^2}{2\tau^2} + \mathrm{const}.$$

Hence the conditional variance and mean are:

$$\gamma_c^2 = \left( \frac{1}{\gamma_0^2} + \frac{D}{\tau^2} \right)^{-1},$$

$$\mu_c = \gamma_c^2 \left( \frac{\mu_0}{\gamma_0^2} + \frac{1}{\tau^2} \sum_{j=1}^{D} \theta_j \right),$$

which agrees with (24.115).
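As a sanity check, this update is straightforward to code. Below is a minimal NumPy sketch of the draw for $\mu$; the function and argument names (`sample_mu`, `gamma0_sq`, etc.) are mine, not the book's, and the group means are assumed to be stored in a 1-D array `theta`.

```python
import numpy as np

def sample_mu(theta, mu0, gamma0_sq, tau_sq, rng):
    """Draw mu from its full conditional N(mu_c, gamma_c^2)."""
    D = len(theta)
    gamma_c_sq = 1.0 / (1.0 / gamma0_sq + D / tau_sq)                # conditional variance
    mu_c = gamma_c_sq * (mu0 / gamma0_sq + np.sum(theta) / tau_sq)   # conditional mean
    return rng.normal(mu_c, np.sqrt(gamma_c_sq))
```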

For the centroids:

$$p(\theta_k \mid \mathcal{D}, \theta_{-k}, \mu, \tau^2, \sigma^2) = \frac{p(\theta_k, \mathcal{D}, \theta_{-k}, \mu, \tau^2, \sigma^2)}{p(\mathcal{D}, \theta_{-k}, \mu, \tau^2, \sigma^2)} \propto p(\theta_k \mid \mu, \tau^2)\, p(\mathcal{D} \mid \theta_k, \theta_{-k}, \sigma^2) \propto \exp\left\{ -\frac{(\theta_k - \mu)^2}{2\tau^2} \right\} \exp\left\{ -\sum_{i=1}^{N_k} \frac{(\theta_k - x_{ik})^2}{2\sigma^2} \right\}.$$

We end up with a Gaussian with variance:

$$\tau_k^2 = \left( \frac{1}{\tau^2} + \frac{N_k}{\sigma^2} \right)^{-1},$$

and mean:

$$\mu_k = \tau_k^2 \left( \frac{\mu}{\tau^2} + \frac{1}{\sigma^2} \sum_{i=1}^{N_k} x_{ik} \right).$$
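Continuing the same hypothetical sketch, each $\theta_k$ can be drawn independently given the rest; here the data are assumed to be stored as a list `x` of 1-D arrays, one array per group.

```python
import numpy as np

def sample_theta(x, mu, tau_sq, sigma_sq, rng):
    """Draw each theta_k from N(mu_k, tau_k^2); x is a list of 1-D arrays, one per group."""
    theta = np.empty(len(x))
    for k, x_k in enumerate(x):
        N_k = len(x_k)
        tau_k_sq = 1.0 / (1.0 / tau_sq + N_k / sigma_sq)             # conditional variance
        mu_k = tau_k_sq * (mu / tau_sq + np.sum(x_k) / sigma_sq)     # conditional mean
        theta[k] = rng.normal(mu_k, np.sqrt(tau_k_sq))
    return theta
```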

For the variance on the centroids:

$$p(\tau^2 \mid \theta, \mathcal{D}, \mu, \sigma^2) = \frac{p(\tau^2, \theta, \mathcal{D}, \mu, \sigma^2)}{p(\theta, \mathcal{D}, \mu, \sigma^2)} \propto p(\tau^2) \prod_{j=1}^{D} p(\theta_j \mid \mu, \tau^2) \propto (\tau^2)^{-\frac{\eta_0}{2} - 1} \exp\left\{ -\frac{\eta_0 \tau_0^2}{2\tau^2} \right\} (\tau^2)^{-\frac{D}{2}} \exp\left\{ -\sum_{j=1}^{D} \frac{(\theta_j - \mu)^2}{2\tau^2} \right\}.$$

Therefore the parameters of the conditional IG distribution are:

$$\eta_c = \eta_0 + D,$$

$$\tau_c^2 = \frac{\eta_0 \tau_0^2 + \sum_{j=1}^{D} (\theta_j - \mu)^2}{\eta_c}.$$
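In code, this draw reduces to sampling an inverse-Gamma variate, which NumPy supports indirectly: if $X \sim \mathrm{Ga}(a, \mathrm{rate}=b)$ then $1/X \sim \mathrm{IG}(a, b)$. A sketch under the same assumed names:

```python
import numpy as np

def sample_tau_sq(theta, mu, eta0, tau0_sq, rng):
    """Draw tau^2 from IG(eta_c / 2, eta_c * tau_c^2 / 2)."""
    D = len(theta)
    eta_c = eta0 + D
    tau_c_sq = (eta0 * tau0_sq + np.sum((theta - mu) ** 2)) / eta_c
    # NumPy's gamma uses a scale parameter, so scale = 1 / rate = 2 / (eta_c * tau_c_sq).
    return 1.0 / rng.gamma(eta_c / 2.0, 2.0 / (eta_c * tau_c_sq))
```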

Finally, we handle the variance on the data:

$$p(\sigma^2 \mid \theta, \mathcal{D}, \mu, \tau^2) \propto p(\sigma^2)\, p(\mathcal{D} \mid \theta, \sigma^2) \propto (\sigma^2)^{-\frac{\nu_0}{2} - 1} \exp\left\{ -\frac{\nu_0 \sigma_0^2}{2\sigma^2} \right\} (\sigma^2)^{-\frac{N}{2}} \exp\left\{ -\sum_{j=1}^{D} \sum_{i=1}^{N_j} \frac{(\theta_j - x_{ij})^2}{2\sigma^2} \right\},$$

where $N = \sum_{j=1}^{D} N_j$. This matches (24.118).
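Putting the four conditionals together gives one Gibbs sweep. The sketch below reuses the hypothetical helpers defined above (`sample_theta`, `sample_mu`, `sample_tau_sq`) and adds the analogous draw for $\sigma^2$; the `hyper` dictionary holding the hyperparameters is my own convention, not the book's.

```python
import numpy as np

def sample_sigma_sq(x, theta, nu0, sigma0_sq, rng):
    """Draw sigma^2 from IG(nu_c / 2, nu_c * sigma_c^2 / 2), with nu_c = nu0 + N."""
    N = sum(len(x_k) for x_k in x)
    ss = sum(np.sum((x_k - theta[k]) ** 2) for k, x_k in enumerate(x))
    nu_c = nu0 + N
    sigma_c_sq = (nu0 * sigma0_sq + ss) / nu_c
    return 1.0 / rng.gamma(nu_c / 2.0, 2.0 / (nu_c * sigma_c_sq))

def gibbs_sweep(x, theta, mu, tau_sq, sigma_sq, hyper, rng):
    """One full scan over (theta, mu, tau^2, sigma^2) using the four full conditionals."""
    theta = sample_theta(x, mu, tau_sq, sigma_sq, rng)
    mu = sample_mu(theta, hyper["mu0"], hyper["gamma0_sq"], tau_sq, rng)
    tau_sq = sample_tau_sq(theta, mu, hyper["eta0"], hyper["tau0_sq"], rng)
    sigma_sq = sample_sigma_sq(x, theta, hyper["nu0"], hyper["sigma0_sq"], rng)
    return theta, mu, tau_sq, sigma_sq
```

Repeatedly calling `gibbs_sweep` from some initial state produces, after burn-in, draws from the joint posterior over $(\theta, \mu, \tau^2, \sigma^2)$.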
