Exercise 21.4 - Variational lower bound for VB for GMMs

Answers

For variational GMM, the lower bound is given by:

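$$
\begin{aligned}
\mathbb{E}_q\left[\log \frac{p(\theta, \mathcal{D})}{q(\theta)}\right]
&= \mathbb{E}_q[\log p(\theta, \mathcal{D})] - \mathbb{E}_q[\log q(\theta)] \\
&= \mathbb{E}_q[\log p(\mathcal{D} \mid \theta)] + \mathbb{E}_q[\log p(\theta)] - \mathbb{E}_q[\log q(\theta)] \\
&= \mathbb{E}[\log p(\mathbf{x} \mid \mathbf{z}, \mu, \Lambda, \pi)] + \mathbb{E}[\log p(\mathbf{z}, \mu, \Lambda, \pi)] - \mathbb{E}[\log q(\mathbf{z}, \mu, \Lambda, \pi)] \\
&= \mathbb{E}[\log p(\mathbf{x} \mid \mathbf{z}, \mu, \Lambda, \pi)] + \mathbb{E}[\log p(\mathbf{z} \mid \pi)] + \mathbb{E}[\log p(\pi)] + \mathbb{E}[\log p(\mu, \Lambda)] \\
&\quad - \mathbb{E}[\log q(\mathbf{z})] - \mathbb{E}[\log q(\pi)] - \mathbb{E}[\log q(\mu, \Lambda)].
\end{aligned}
$$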

This derivation uses Bayes' rule together with the factorized forms of the joint distribution and of the variational distribution $q$. We now prove (21.209) through (21.215).

For (21.209):

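$$
\begin{aligned}
\mathbb{E}[\log p(\mathbf{x} \mid \mathbf{z}, \mu, \Lambda)]
&= \mathbb{E}_{q(\mathbf{z}) q(\mu, \Lambda)}[\log p(\mathbf{x} \mid \mathbf{z}, \mu, \Lambda)] \\
&= \sum_n \sum_k \mathbb{E}_{q(\mathbf{z}) q(\mu, \Lambda)}\left[ z_{nk} \left( -\frac{D}{2} \log 2\pi + \frac{1}{2} \log |\Lambda_k| - \frac{1}{2} (\mathbf{x}_n - \mu_k)^T \Lambda_k (\mathbf{x}_n - \mu_k) \right) \right].
\end{aligned}
$$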

Plugging (21.131) and (21.132) into this expression, we end up with (21.209).

For (21.210):

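$$
\begin{aligned}
\mathbb{E}[\log p(\mathbf{z} \mid \pi)]
&= \mathbb{E}_{q(\mathbf{z}) q(\pi)}[\log p(\mathbf{z} \mid \pi)] \\
&= \mathbb{E}_{q(\mathbf{z}) q(\pi)}\left[ \log \prod_{n=1}^{N} \prod_{k=1}^{K} \pi_k^{z_{nk}} \right] \\
&= \sum_{n=1}^{N} \sum_{k=1}^{K} \mathbb{E}_{q(\mathbf{z}) q(\pi)}[z_{nk} \log \pi_k] \\
&= \sum_{n=1}^{N} \sum_{k=1}^{K} \mathbb{E}_{q(\mathbf{z})}[z_{nk}] \, \mathbb{E}_{q(\pi)}[\log \pi_k] \\
&= \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \log \bar{\pi}_k,
\end{aligned}
$$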

where we used the fact that the expectation of a product of independent random variables is the product of their expectations. The notation follows (21.129).
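The factorization step above can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original solution: the Dirichlet parameters `alpha`, the responsibilities `r`, and the sample count `S` are hypothetical values chosen only for illustration. It samples $\mathbf{z}$ and $\pi$ independently from $q(\mathbf{z})$ and $q(\pi)$ and compares the joint average of $\sum_n \sum_k z_{nk} \log \pi_k$ with $\sum_n \sum_k r_{nk} \, \mathbb{E}[\log \pi_k]$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical variational parameters (assumptions, not from the exercise).
alpha = np.array([2.0, 3.0, 5.0])      # Dirichlet parameters of q(pi)
r = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.6, 0.3]])        # responsibilities r_nk, rows sum to 1

S = 100_000                            # number of Monte Carlo samples
pi = rng.dirichlet(alpha, size=S)      # (S, K) samples from q(pi)
log_pi = np.log(pi)

# One-hot samples of each z_n from q(z), drawn independently of pi: shape (S, N, K).
z = np.stack([rng.multinomial(1, r_n, size=S) for r_n in r], axis=1)

# Left side: E[sum_n sum_k z_nk log pi_k], averaged jointly over (z, pi).
lhs = float(np.mean(np.sum(z * log_pi[:, None, :], axis=(1, 2))))

# Right side: sum_n sum_k E[z_nk] E[log pi_k], using E[z_nk] = r_nk.
rhs = float(np.sum(r * log_pi.mean(axis=0)))

print(lhs, rhs)
```

The two printed numbers should agree up to Monte Carlo noise, which shrinks as `S` grows.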

For (21.211):

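$$
\begin{aligned}
\mathbb{E}[\log p(\pi)]
&= \mathbb{E}_{q(\pi)}[\log p(\pi)] \\
&= \mathbb{E}_{q(\pi)}\left[ \log \left( \mathrm{const} \cdot \prod_{k=1}^{K} \pi_k^{\alpha_0 - 1} \right) \right] \\
&= \log \mathrm{const} + (\alpha_0 - 1) \sum_{k=1}^{K} \log \bar{\pi}_k,
\end{aligned}
$$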

plugging (21.216) finishes the proof.

For (21.212):

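$$
\begin{aligned}
\mathbb{E}[\log p(\mu, \Lambda)]
&= \mathbb{E}_{q(\mu, \Lambda)}[\log p(\mu, \Lambda)] \\
&= \mathbb{E}_{q(\mu, \Lambda)}\left[ \log \prod_{k=1}^{K} \mathrm{Wi}(\Lambda_k \mid L_0, v_0) \, \mathcal{N}(\mu_k \mid m_0, (\beta_0 \Lambda_k)^{-1}) \right] \\
&= \sum_{k=1}^{K} \mathbb{E}_{q(\mu, \Lambda)}\Big[ \log C + \frac{1}{2}(v_0 - D - 1) \log |\Lambda_k| - \frac{1}{2} \mathrm{tr}\{\Lambda_k L_0^{-1}\} \\
&\qquad - \frac{D}{2} \log 2\pi + \frac{1}{2} \log |\beta_0 \Lambda_k| - \frac{1}{2} (\mu_k - m_0)^T (\beta_0 \Lambda_k) (\mu_k - m_0) \Big].
\end{aligned}
$$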

The result then follows by using (21.131) to expand the expected value of the quadratic form, together with the fact that the mean of the Wishart distribution $\mathrm{Wi}(\Lambda_k \mid L_k, v_k)$ is $v_k L_k$.
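The Wishart mean used here can be sanity-checked by simulation. The sketch below is an illustration under stated assumptions, not part of the original solution: it takes an integer number of degrees of freedom `v`, so that a Wishart draw can be built as a sum of `v` outer products of Gaussian vectors with covariance `L`, and the values of `D`, `v`, `L`, and `S` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical parameters (assumptions): dimension, integer degrees of freedom, scale matrix.
D, v = 2, 5
L = np.array([[2.0, 0.5],
              [0.5, 1.0]])

S = 50_000  # number of Wishart samples

# For integer v, Lambda = sum_{i=1}^{v} x_i x_i^T with x_i ~ N(0, L) is a draw from Wi(L, v).
x = rng.multivariate_normal(np.zeros(D), L, size=(S, v))   # shape (S, v, D)
Lambda = np.einsum("svi,svj->sij", x, x)                    # shape (S, D, D)

mc_mean = Lambda.mean(axis=0)   # Monte Carlo estimate of E[Lambda]
exact_mean = v * L              # the mean v L used in the derivation

print(mc_mean)
print(exact_mean)
```

The same identity gives $\mathbb{E}[\mathrm{tr}\{\Lambda_k L_0^{-1}\}] = v_k \, \mathrm{tr}\{L_k L_0^{-1}\}$ by linearity of the trace.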

For (21.213):

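$$
\begin{aligned}
\mathbb{E}[\log q(\mathbf{z})]
&= \mathbb{E}_{q(\mathbf{z})}[\log q(\mathbf{z})] \\
&= \mathbb{E}_{q(\mathbf{z})}\left[ \sum_i \sum_k z_{ik} \log r_{ik} \right] \\
&= \sum_i \sum_k \mathbb{E}_{q(\mathbf{z})}[z_{ik}] \log r_{ik} \\
&= \sum_i \sum_k r_{ik} \log r_{ik}.
\end{aligned}
$$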

We only have to recall (21.124).
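As a small illustration of this term, the sketch below evaluates $\sum_i \sum_k r_{ik} \log r_{ik}$ for a responsibility matrix. The function name `expected_log_q_z`, the clipping constant `eps`, and the example matrix are hypothetical choices made here, not taken from the exercise.

```python
import numpy as np

def expected_log_q_z(r, eps=1e-12):
    """Return sum_i sum_k r_ik * log r_ik for an (N, K) responsibility matrix."""
    r = np.asarray(r, dtype=float)
    # Clip only inside the log, so entries with r_ik = 0 contribute 0 instead of NaN.
    return float(np.sum(r * np.log(np.clip(r, eps, None))))

# Hypothetical responsibilities: N = 4 points, K = 2 components, uniform rows.
r_uniform = np.full((4, 2), 0.5)
value = expected_log_q_z(r_uniform)
print(value)
```

For uniform responsibilities the value equals $-N \log K$, and for hard (one-hot) assignments it is zero, which matches the reading of this term as the negative entropy of $q(\mathbf{z})$.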

For (21.214):

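$$
\begin{aligned}
\mathbb{E}[\log q(\pi)]
&= \mathbb{E}_{q(\pi)}[\log q(\pi)] \\
&= \mathbb{E}_{q(\pi)}\left[ \log C + \sum_{k=1}^{K} (\alpha_k - 1) \log \pi_k \right] \\
&= \log C + \sum_k (\alpha_k - 1) \log \bar{\pi}_k.
\end{aligned}
$$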

Finally, for (21.215):

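$$
\begin{aligned}
\mathbb{E}[\log q(\mu, \Lambda)]
&= \mathbb{E}_{q(\mu, \Lambda)}[\log q(\mu, \Lambda)] \\
&= \sum_k \mathbb{E}_{q(\mu, \Lambda)}\left[ \log q(\Lambda_k) - \frac{D}{2} \log 2\pi + \frac{1}{2} \log |\beta_k \Lambda_k| - \frac{1}{2} (\mu_k - m_k)^T (\beta_k \Lambda_k) (\mu_k - m_k) \right].
\end{aligned}
$$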

Using (21.132) to expand the quadratic gives $\mathbb{E}[(\mu_k - m_k)^T (\beta_k \Lambda_k) (\mu_k - m_k)] = D$.
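This last identity is easy to check by simulation. The sketch below is a hedged illustration with hypothetical values for `D`, `beta_k`, `m_k`, and `Lambda_k`: it holds $\Lambda_k$ fixed, draws $\mu_k \sim \mathcal{N}(m_k, (\beta_k \Lambda_k)^{-1})$, and averages the quadratic form. Since the conditional expectation equals $D$ for any $\Lambda_k$, averaging over $q(\Lambda_k)$ as well would not change the result.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (assumptions, chosen only for illustration).
D = 3
beta_k = 2.5
m_k = np.array([1.0, -2.0, 0.5])

# A fixed symmetric positive-definite precision matrix Lambda_k.
A = rng.normal(size=(D, D))
Lambda_k = A @ A.T + D * np.eye(D)

precision = beta_k * Lambda_k
cov = np.linalg.inv(precision)
cov = (cov + cov.T) / 2.0          # symmetrize against round-off

# Draw mu_k ~ N(m_k, (beta_k Lambda_k)^{-1}) and average the quadratic form.
mu = rng.multivariate_normal(m_k, cov, size=200_000)
diff = mu - m_k
quad = np.einsum("ni,ij,nj->n", diff, precision, diff)
mc_estimate = float(quad.mean())

print(mc_estimate, D)
```

The quadratic form follows a chi-squared distribution with $D$ degrees of freedom, so its sample mean should be close to $D$.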

2021-03-24 13:42