Exercise 21.9 - Variational EM for binary FA with sigmoid link

Answers

We begin with the likelihood:

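$$
p(\mathbf{X} \mid \mathbf{Z}, \mathbf{W}) = \prod_{n=1}^{N} \prod_{j=1}^{D} \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n)^{x_{nj}} \left(1 - \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n)\right)^{(1 - x_{nj})}.
$$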

The prior for the hidden variables is assumed to be:

$$
\forall n, \quad p(\mathbf{z}_n) = \mathcal{N}(\mathbf{0}, \mathbf{I}).
$$

Assume the factorized variational distribution:

$$
p(\mathbf{W}, \mathbf{Z} \mid \mathbf{X}) \approx q(\mathbf{W}) \prod_{n=1}^{N} q(\mathbf{z}_n).
$$

For the variational E-step, our goal is to match the logarithm of the variational distribution on the hidden variables:

$$
\log q(\mathbf{z}),
$$

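with the expected log joint (only the likelihood terms are written out; the prior terms are omitted):

$$
\begin{aligned}
\mathbb{E}_{q(\mathbf{W})}\left[\log p(\mathbf{Z}, \mathbf{X}, \mathbf{W})\right]
&= \mathbb{E}_{q(\mathbf{W})}\left[\sum_{n} \sum_{j} x_{nj} \log \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n) + (1 - x_{nj}) \log\left(1 - \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n)\right)\right] \\
&= \sum_{n,j} \mathbb{E}_{q(\mathbf{W})}\left[x_{nj} \log \frac{\operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n)}{1 - \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n)} + \log\left(1 - \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n)\right)\right] \\
&= \sum_{n,j} \mathbb{E}_{q(\mathbf{W})}\left[x_{nj} \mathbf{w}_j^T \mathbf{z}_n + \log\left(1 - \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n)\right)\right].
\end{aligned}
$$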
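The last equality can be checked numerically. The sketch below is only an illustration with randomly generated data (the sizes, the seed, and the function names are assumptions, not part of the exercise); it evaluates the Bernoulli log-likelihood in its original form and in the logit form $x_{nj} \mathbf{w}_j^T \mathbf{z}_n + \log(1 - \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n))$ for fixed $\mathbf{W}$ and $\mathbf{Z}$.

```python
import numpy as np

def sigm(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-a))

def bernoulli_loglik(X, W, Z):
    """sum_{n,j} x log sigm(a) + (1 - x) log(1 - sigm(a)), with a = w_j^T z_n."""
    A = Z @ W.T  # A[n, j] = w_j^T z_n
    P = sigm(A)
    return np.sum(X * np.log(P) + (1.0 - X) * np.log(1.0 - P))

def bernoulli_loglik_logit_form(X, W, Z):
    """sum_{n,j} x * a + log(1 - sigm(a)), using log(1 - sigm(a)) = -log(1 + e^a)."""
    A = Z @ W.T
    return np.sum(X * A - np.logaddexp(0.0, A))

# Hypothetical toy data, chosen only for the check.
rng = np.random.default_rng(0)
N, D, K = 5, 4, 2
W = rng.normal(size=(D, K))
Z = rng.normal(size=(N, K))
X = rng.integers(0, 2, size=(N, D)).astype(float)

print(bernoulli_loglik(X, W, Z), bernoulli_loglik_logit_form(X, W, Z))
```

The two printed values should agree up to floating-point error, since the rewrite only uses $\log \frac{\operatorname{sigm}(a)}{1 - \operatorname{sigm}(a)} = a$.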

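This expression does not reduce to an exponential-family form in $\mathbf{z}_n$, because of the term $\log(1 - \operatorname{sigm}(\mathbf{w}_j^T \mathbf{z}_n))$. An approximation is therefore needed to replace this term with a function that is linear in $\mathbf{z}_n$ and optionally quadratic in $\mathbf{z}_n$, i.e., containing $\mathbf{z}_n^T \mathbf{z}_n$-type terms (e.g., the Laplace approximation). After this replacement, $\mathbb{E}_{q(\mathbf{W})}[\log p(\mathbf{Z}, \mathbf{X}, \mathbf{W})]$ is a quadratic function of $\mathbf{z}_n$, hence the E-step yields a Gaussian $q(\mathbf{z}_n)$.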
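As a rough illustration of how a quadratic approximation leads to a Gaussian, the sketch below uses a second-order Taylor expansion of $\log(1 - \operatorname{sigm}(a))$ around $a_0 = \mathbf{w}_j^T \mathbf{m}$, where $\mathbf{m}$ is the current mean, and re-expands until the mean stops moving (a Laplace-style fit). It makes simplifying assumptions that are not in the exercise: $\mathbf{W}$ is treated as a fixed point estimate rather than averaged under $q(\mathbf{W})$, and the function name `laplace_e_step` is hypothetical.

```python
import numpy as np

def sigm(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-a))

def laplace_e_step(x, W, n_iter=50):
    """Gaussian approximation q(z) = N(m, S) for one binary vector x of shape (D,).

    Assumes W (D x K) is a fixed point estimate and the prior is z ~ N(0, I).
    Each pass expands log(1 - sigm(a)) to second order around a0 = W m:
      first derivative  -sigm(a0), second derivative -sigm(a0) (1 - sigm(a0)).
    """
    D, K = W.shape
    m = np.zeros(K)
    for _ in range(n_iter):
        a0 = W @ m
        s = sigm(a0)
        r = s * (1.0 - s)                                 # curvature weights
        precision = np.eye(K) + W.T @ (r[:, None] * W)    # prior + likelihood curvature
        eta = W.T @ (x - s + r * a0)                      # linear coefficient in z
        m = np.linalg.solve(precision, eta)
    # Recompute the curvature at the final mean for the covariance.
    s = sigm(W @ m)
    r = s * (1.0 - s)
    precision = np.eye(K) + W.T @ (r[:, None] * W)
    return m, np.linalg.inv(precision)

# Hypothetical toy inputs.
rng = np.random.default_rng(1)
W = 0.3 * rng.normal(size=(6, 2))
x = rng.integers(0, 2, size=6).astype(float)
m, S = laplace_e_step(x, W)
print(m, S)
```

Each pass is equivalent to a Newton step on the log posterior of $\mathbf{z}_n$, so at convergence $\mathbf{m}$ is the posterior mode and $\mathbf{S}$ is the inverse of the negative Hessian there. Other quadratic approximations would change the weights `r` but keep the same Gaussian form.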

For the variational M-step:

$$
\mathbb{E}_{q(\mathbf{Z})}\left[\log p(\mathbf{Z}, \mathbf{X}, \mathbf{W})\right]
$$

can again be approximated as a quadratic function of $\mathbf{W}$, where the expectations involving $\mathbf{z}_n$ are evaluated under the $q(\mathbf{z}_n)$ obtained in the E-step.
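One way this could look in practice is sketched below, under assumptions that go beyond the exercise: $\mathbf{W}$ is updated as a point estimate one row $\mathbf{w}_j$ at a time, $\log(1 - \operatorname{sigm}(a))$ is expanded to second order around $a_0 = \mathbf{w}_j^T \mathbf{m}_n$, and a small ridge term `ridge` stands in for a Gaussian prior on $\mathbf{w}_j$. The function name `m_step_row` and all argument names are hypothetical. The E-step enters only through the moments $\mathbb{E}[\mathbf{z}_n] = \mathbf{m}_n$ and $\mathbb{E}[\mathbf{z}_n \mathbf{z}_n^T] = \mathbf{S}_n + \mathbf{m}_n \mathbf{m}_n^T$.

```python
import numpy as np

def sigm(a):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-a))

def m_step_row(x_j, M, S, w_j, ridge=1e-3):
    """One quadratic-approximation update of a single row w_j (shape (K,)).

    x_j : (N,)      binary observations for dimension j
    M   : (N, K)    posterior means m_n from the E-step
    S   : (N, K, K) posterior covariances S_n from the E-step
    w_j : (K,)      current value, used as the expansion point
    """
    K = M.shape[1]
    a0 = M @ w_j
    s = sigm(a0)
    r = s * (1.0 - s)
    second_moment = S + M[:, :, None] * M[:, None, :]    # E[z_n z_n^T]
    A = np.einsum("n,nkl->kl", r, second_moment) + ridge * np.eye(K)
    b = M.T @ (x_j - s + r * a0)
    return np.linalg.solve(A, b)                          # maximizer of the quadratic

# Hypothetical toy inputs.
rng = np.random.default_rng(2)
N, K = 20, 2
M = rng.normal(size=(N, K))
S = np.tile(0.1 * np.eye(K), (N, 1, 1))
x_j = rng.integers(0, 2, size=N).astype(float)
w_new = m_step_row(x_j, M, S, w_j=np.zeros(K))
print(w_new)
```

When all $\mathbf{S}_n$ are zero, this update coincides with a Newton step for ridge-regularized logistic regression on the means $\mathbf{m}_n$; the covariances add extra curvature that shrinks $\mathbf{w}_j$.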

2021-03-24 13:42