
Exercise 13.3 - Derivation of fixed point updates for EB for linear regression

Answers

Instead of EM, this method directly optimizes the posterior over the hyperparameters $(\alpha, \beta)$, with Gamma hyperpriors on them; up to an additive constant, its logarithm is:

$$\ell(\alpha, \beta) = \log p(\mathbf{Y} \mid \mathbf{X}, \alpha, \beta) + \sum_j \left( a \log \alpha_j - b\,\alpha_j \right) + c \log \beta - d\,\beta.$$

By (4.126), marginalizing out 𝐰 yields:

$$p(\mathbf{Y} \mid \mathbf{X}, \alpha, \beta) = \mathcal{N}(\mathbf{Y} \mid \mathbf{0}, \Sigma_{\mathbf{Y}}),$$

where:

$$\Sigma_{\mathbf{Y}} = \beta^{-1} \mathbf{I}_N + \mathbf{X} \mathbf{A}^{-1} \mathbf{X}^{T}, \qquad \mathbf{A} = \operatorname{diag}(\alpha_1, \dots, \alpha_D),$$

with $\mathbf{X} \in \mathbb{R}^{N \times D}$ the design matrix.
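Spelling out this step (a sketch, assuming the exercise's generative model $\mathbf{w} \sim \mathcal{N}(\mathbf{0}, \mathbf{A}^{-1})$ and $\mathbf{Y} = \mathbf{X}\mathbf{w} + \boldsymbol{\epsilon}$ with $\boldsymbol{\epsilon} \sim \mathcal{N}(\mathbf{0}, \beta^{-1}\mathbf{I}_N)$), the moments follow from linearity and the independence of $\mathbf{w}$ and $\boldsymbol{\epsilon}$:

$$\mathbb{E}[\mathbf{Y}] = \mathbf{X}\,\mathbb{E}[\mathbf{w}] = \mathbf{0}, \qquad \operatorname{Cov}[\mathbf{Y}] = \mathbf{X}\,\mathbf{A}^{-1}\mathbf{X}^{T} + \beta^{-1}\mathbf{I}_N = \Sigma_{\mathbf{Y}}.$$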

Substituting, up to an additive constant:

$$\ell(\alpha, \beta) = -\frac{1}{2} \log \lvert \Sigma_{\mathbf{Y}} \rvert - \frac{1}{2} \mathbf{Y}^{T} \Sigma_{\mathbf{Y}}^{-1} \mathbf{Y} + \sum_j \left( a \log \alpha_j - b\,\alpha_j \right) + c \log \beta - d\,\beta.$$

By the matrix inversion lemma,

$$\Sigma_{\mathbf{Y}}^{-1} = \beta \mathbf{I}_N - \beta^{2}\, \mathbf{X} \Sigma_E \mathbf{X}^{T}, \qquad \Sigma_E := (\mathbf{A} + \beta \mathbf{X}^{T} \mathbf{X})^{-1},$$

where $\Sigma_E$ and $\mu_E := \beta \Sigma_E \mathbf{X}^{T} \mathbf{Y}$ are the posterior covariance and mean of $\mathbf{w}$. Only $\Sigma_E$ depends on $\alpha_j$, and $\partial \Sigma_E / \partial \alpha_j = -\Sigma_E \mathbf{e}_j \mathbf{e}_j^{T} \Sigma_E$, so:

$$\frac{\partial}{\partial \alpha_j} \left[ -\frac{1}{2} \mathbf{Y}^{T} \Sigma_{\mathbf{Y}}^{-1} \mathbf{Y} \right] = \frac{\partial}{\partial \alpha_j} \left[ \frac{1}{2} \beta^{2} (\mathbf{X}^{T} \mathbf{Y})^{T} \Sigma_E\, (\mathbf{X}^{T} \mathbf{Y}) \right] = -\frac{1}{2} \mu_{E,j}^{2}.$$
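As a quick numerical sanity check of this derivative identity, here is an illustrative sketch (dimensions, seed, and variable names are arbitrary, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, beta = 20, 5, 2.0
X = rng.normal(size=(N, D))
Y = rng.normal(size=N)
alpha = rng.uniform(0.5, 2.0, size=D)

def quad_term(alpha):
    """-1/2 * Y^T Sigma_Y^{-1} Y with Sigma_Y = beta^{-1} I + X A^{-1} X^T."""
    Sigma_Y = np.eye(N) / beta + X @ np.diag(1.0 / alpha) @ X.T
    return -0.5 * Y @ np.linalg.solve(Sigma_Y, Y)

# Posterior moments: Sigma_E = (A + beta X^T X)^{-1}, mu_E = beta Sigma_E X^T Y.
Sigma_E = np.linalg.inv(np.diag(alpha) + beta * X.T @ X)
mu_E = beta * Sigma_E @ X.T @ Y

# Central finite difference in alpha_j vs. the closed form -1/2 mu_{E,j}^2.
j, eps = 2, 1e-6
alpha_plus, alpha_minus = alpha.copy(), alpha.copy()
alpha_plus[j] += eps
alpha_minus[j] -= eps
fd = (quad_term(alpha_plus) - quad_term(alpha_minus)) / (2 * eps)
print(fd, -0.5 * mu_E[j] ** 2)  # the two numbers should agree closely
```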

On the other hand, to compute

$$\frac{\partial \log \lvert \Sigma_{\mathbf{Y}} \rvert}{\partial \alpha_j},$$

note that:

$$\left( \Sigma_{\mathbf{Y}} \right)_{mn} = \sum_{j=1}^{D} x_{mj}\, x_{nj}\, \alpha_j^{-1} + \beta^{-1}\, \mathbb{I}(m = n).$$

Thus $\Sigma_{\mathbf{Y}}$ depends on $\alpha_j$ only through the rank-one term $\alpha_j^{-1} \mathbf{x}_{:,j} \mathbf{x}_{:,j}^{T}$, where $\mathbf{x}_{:,j}$ is the $j$-th column of $\mathbf{X}$. Using the matrix determinant lemma, $\lvert \Sigma_{\mathbf{Y}} \rvert = \beta^{-N} \lvert \mathbf{A} \rvert^{-1} \lvert \Sigma_E^{-1} \rvert$, the term $\partial \log \lvert \Sigma_{\mathbf{Y}} \rvert / \partial \alpha_j$ boils down to $-\gamma_j\, \alpha_j^{-1}$, where $\gamma_j := 1 - \alpha_j \Sigma_{E,jj}$. Hence the update is:

$$\alpha_j = \frac{\gamma_j + 2a}{\mu_{E,j}^{2} + 2b}.$$
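Explicitly, the stationarity condition being solved is:

$$\frac{\partial \ell}{\partial \alpha_j} = \frac{\gamma_j}{2\alpha_j} + \frac{a}{\alpha_j} - \frac{\mu_{E,j}^{2}}{2} - b = 0 \quad\Longleftrightarrow\quad \alpha_j \left( \mu_{E,j}^{2} + 2b \right) = \gamma_j + 2a.$$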

For $\beta$, the procedure is similar: using $\partial \log \lvert \Sigma_{\mathbf{Y}} \rvert / \partial \beta = (\sum_j \gamma_j - N)/\beta$ and $\partial (\mathbf{Y}^{T} \Sigma_{\mathbf{Y}}^{-1} \mathbf{Y}) / \partial \beta = \lVert \mathbf{Y} - \mathbf{X} \mu_E \rVert^{2}$, setting $\partial \ell / \partial \beta = 0$ gives:

$$\beta = \frac{N - \sum_j \gamma_j + 2c}{\lVert \mathbf{Y} - \mathbf{X} \mu_E \rVert^{2} + 2d}.$$
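Putting the two updates together, here is a minimal sketch of the resulting fixed-point iteration (illustrative Python; the function name and default hyperparameter values are my own, not the book's):

```python
import numpy as np

def eb_linear_regression(X, Y, a=1e-3, b=1e-3, c=1e-3, d=1e-3, n_iters=100):
    """EB fixed-point updates for linear regression with Gamma hyperpriors."""
    N, D = X.shape
    alpha = np.ones(D)
    beta = 1.0
    for _ in range(n_iters):
        # Posterior of w given the current hyperparameters.
        Sigma_E = np.linalg.inv(np.diag(alpha) + beta * X.T @ X)
        mu_E = beta * Sigma_E @ X.T @ Y
        # gamma_j = 1 - alpha_j * Sigma_{E,jj}.
        gamma = 1.0 - alpha * np.diag(Sigma_E)
        # Fixed-point updates derived above.
        alpha = (gamma + 2 * a) / (mu_E ** 2 + 2 * b)
        beta = (N - gamma.sum() + 2 * c) / (np.sum((Y - X @ mu_E) ** 2) + 2 * d)
    return alpha, beta, mu_E

# Example: sparse ground truth; most alpha_j should grow large,
# pruning the corresponding weights.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
w_true = np.zeros(10)
w_true[[0, 3]] = [2.0, -3.0]
Y = X @ w_true + 0.1 * rng.normal(size=100)
alpha, beta, mu_E = eb_linear_regression(X, Y)
print(np.round(mu_E, 2))
```

In practice one would monitor $\ell(\alpha, \beta)$ or the change in $\alpha$ for convergence rather than running a fixed number of iterations.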
