Exercise 6.2 - James-Stein estimator for Gaussian means

Answers

The prior for $\theta_i$ is:

$$\mathcal{N}(\theta_i \mid m_0, \tau_0^2),$$

and the likelihood is given by:

$$\mathcal{N}(y_i \mid \theta_i, \sigma^2).$$

For question (a), we begin by integrating out $\theta_i$ and establishing the dependence of $\mathcal{D} = \{y_i\}_{i=1}^{6}$ on $m_0$ and $\tau_0^2$:

$$
\begin{aligned}
p(y \mid m_0, \tau_0^2) &= \int p(y, \theta \mid m_0, \tau_0^2)\, d\theta = \int p(y \mid \theta, \sigma^2)\, p(\theta \mid m_0, \tau_0^2)\, d\theta \\
&= \int \frac{1}{2\pi \sqrt{\sigma^2 \tau_0^2}} \exp\left\{ -\frac{1}{2} \left( \frac{(y - \theta)^2}{\sigma^2} + \frac{(\theta - m_0)^2}{\tau_0^2} \right) \right\} d\theta \\
&\propto \exp\left\{ -\frac{1}{2} \left( -\frac{\left( \frac{y}{\sigma^2} + \frac{m_0}{\tau_0^2} \right)^2}{\frac{1}{\sigma^2} + \frac{1}{\tau_0^2}} + \frac{y^2}{\sigma^2} \right) \right\} \int \exp\left\{ -\left( a\theta - b(y) \right)^2 \right\} d\theta \\
&\propto \exp\left\{ -\frac{1}{2} \left( -\frac{\left( y + m_0 \sigma^2 / \tau_0^2 \right)^2}{\sigma^2 + \sigma^4 / \tau_0^2} + \frac{y^2}{\sigma^2} \right) \right\} \\
&= \mathcal{N}(y \mid m_0, \sigma^2 + \tau_0^2),
\end{aligned}
$$

where we cancel terms independent of $y$ and complete the square in the final step of the derivation.
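As a quick numerical sanity check of this marginal, the sketch below compares quadrature over $\theta$ with the closed form $\mathcal{N}(y \mid m_0, \sigma^2 + \tau_0^2)$; the values of $m_0$, $\tau_0^2$, and $y$ are arbitrary illustrative choices, not taken from the exercise.

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

# Arbitrary illustrative values (not the exercise's data).
m0, tau0_sq, sigma_sq, y = 1500.0, 1300.0, 500.0, 1550.0

# Marginal by quadrature: integrate N(y | theta, sigma^2) N(theta | m0, tau0^2) d(theta).
integrand = lambda theta: (norm.pdf(y, theta, np.sqrt(sigma_sq))
                           * norm.pdf(theta, m0, np.sqrt(tau0_sq)))
numeric, _ = integrate.quad(integrand, -np.inf, np.inf)

# Closed form from the derivation above.
closed = norm.pdf(y, m0, np.sqrt(sigma_sq + tau0_sq))

print(numeric, closed)  # the two values agree to numerical precision
```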

Given $\sigma^2 = 500$, the ML-II estimates are:

$$\hat{m}_0 = \bar{y} = 1527.5,$$

$$\hat{\tau}_0^2 = s^2 - \sigma^2 = 1878.58 - 500 = 1378.58,$$

where $s^2 = \frac{1}{6} \sum_{i=1}^{6} (y_i - \bar{y})^2$ is the ML estimate of the marginal variance $\sigma^2 + \tau_0^2$.
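A minimal sketch of the ML-II computation. The six values below are placeholders chosen only to be consistent with the statistics quoted above ($\bar{y} = 1527.5$, $s^2 \approx 1878.58$); they are not guaranteed to match the exercise's actual data.

```python
import numpy as np

# Placeholder observations consistent with the quoted summary statistics.
y = np.array([1505.0, 1528.0, 1564.0, 1498.0, 1600.0, 1470.0])
sigma_sq = 500.0  # given observation-noise variance

m0_hat = y.mean()                  # hat{m}_0 = bar{y}
s_sq = np.mean((y - m0_hat) ** 2)  # ML estimate of the marginal variance
tau0_sq_hat = s_sq - sigma_sq      # hat{tau}_0^2 = s^2 - sigma^2

print(m0_hat, tau0_sq_hat)  # ~1527.5 and ~1378.58
```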

For question (b), the posterior distribution of $\theta_i$ given $y_i$ and the hyperparameters is:

$$
p(\theta \mid y) \propto p(\theta)\, p(y \mid \theta) \propto \exp\left\{ -\frac{(\theta - m_0)^2}{2\tau_0^2} - \frac{(\theta - y)^2}{2\sigma^2} \right\} = \mathcal{N}\left( \theta \,\middle|\, \frac{m_0 \sigma^2 + y \tau_0^2}{\sigma^2 + \tau_0^2},\ \frac{\sigma^2 \tau_0^2}{\sigma^2 + \tau_0^2} \right).
$$
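Equivalently, the posterior mean is the convex combination $B m_0 + (1 - B)\, y$ with shrinkage factor $B = \sigma^2 / (\sigma^2 + \tau_0^2)$. A sketch with the plug-in ML-II estimates, reusing the placeholder data above:

```python
import numpy as np

# Placeholder data and plug-in ML-II hyperparameters from above.
y = np.array([1505.0, 1528.0, 1564.0, 1498.0, 1600.0, 1470.0])
sigma_sq, m0, tau0_sq = 500.0, 1527.5, 1378.58

B = sigma_sq / (sigma_sq + tau0_sq)  # shrinkage factor toward m0
post_mean = B * m0 + (1.0 - B) * y   # = (m0*sigma^2 + y*tau0^2) / (sigma^2 + tau0^2)
post_var = sigma_sq * tau0_sq / (sigma_sq + tau0_sq)

print(post_mean, post_var)
```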

For question (c), the 95% credible interval for each $\theta_i$ is $(\mu - 1.96\,\sigma_{\mathrm{post}},\ \mu + 1.96\,\sigma_{\mathrm{post}})$, where $\mu$ and $\sigma_{\mathrm{post}}$ are the posterior mean and posterior standard deviation from question (b).
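A sketch of the resulting intervals, again with the plug-in estimates and the same placeholder data:

```python
import numpy as np

# Placeholder data and plug-in ML-II hyperparameters from above.
y = np.array([1505.0, 1528.0, 1564.0, 1498.0, 1600.0, 1470.0])
sigma_sq, m0, tau0_sq = 500.0, 1527.5, 1378.58

post_mean = (m0 * sigma_sq + y * tau0_sq) / (sigma_sq + tau0_sq)
post_sd = np.sqrt(sigma_sq * tau0_sq / (sigma_sq + tau0_sq))

# One 95% credible interval per theta_i.
intervals = np.column_stack([post_mean - 1.96 * post_sd,
                             post_mean + 1.96 * post_sd])
print(intervals)
```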

For question (d), a smaller $\sigma^2$ reduces the ML-II analysis toward an ordinary posterior analysis of each observation on its own. The parameter $\sigma^2$ can be understood as the noise on the observations: the less noise we assume, the more precise the observations are, and the less necessary the intermediate $\theta_i$ becomes, since the posterior mean shrinks less toward $m_0$ and stays closer to the raw observation $y_i$.
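A small illustration of this limit: as $\sigma^2 \to 0$, the shrinkage factor $B = \sigma^2 / (\sigma^2 + \tau_0^2)$ vanishes and the posterior mean collapses onto the observation (the values below are illustrative, with $y_i = 1600$):

```python
# Shrinkage factor and posterior mean as the observation noise shrinks.
tau0_sq, m0, y_i = 1378.58, 1527.5, 1600.0

for sigma_sq in [500.0, 50.0, 5.0, 0.5]:
    B = sigma_sq / (sigma_sq + tau0_sq)
    post_mean = B * m0 + (1.0 - B) * y_i
    print(f"sigma^2={sigma_sq:7.1f}  B={B:.4f}  posterior mean={post_mean:.2f}")
```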
