Exercise 21.4 - Variational lower bound for VB for GMMs

Answers

For variational GMM, the lower bound is given by:

\begin{align} 𝔼_{q} [\log \frac{p (𝜃, 𝒟)}{q (𝜃)}] = & 𝔼_{q} [\log p (𝜃, 𝒟)] - 𝔼_{q} [\log q (𝜃)] \\ = & 𝔼_{q} [\log p (𝒟 | 𝜃)] + 𝔼_{q} [\log p (𝜃)] - 𝔼_{q} [\log q (𝜃)] \\ = & 𝔼 [\log p (𝐱 | 𝐳, μ, Λ, π)] + 𝔼 [\log p (𝐳, μ, Λ, π)] \\ & - 𝔼 [\log q (𝐳, μ, Λ, π)] \\ = & 𝔼 [\log p (𝐱 | 𝐳, μ, Λ, π)] + 𝔼 [\log p (𝐳 | π)] + 𝔼 [\log p (π)] + 𝔼 [\log p (μ, Λ)] \\ & - 𝔼 [\log q (𝐳)] - 𝔼 [\log q (π)] - 𝔼 [\log q (μ, Λ)] . \end{align}

This uses the product rule together with the factorized forms of the joint distribution and of the variational distribution $q$ . We now proceed to prove (21.209) through (21.215).

For (21.209):

\begin{aligned} 𝔼 [\log p (𝐱 | 𝐳, μ, Λ)] = & 𝔼_{q (𝐳) q (μ, Λ)} [\log p (𝐱 | 𝐳, μ, Λ)] \\ = & \sum_{n} \sum_{k} 𝔼_{q (𝐳) q (μ, Λ)} [z_{𝑛𝑘} (- \frac{D}{2} \log 2 π + \frac{1}{2} \log | Λ_{k} | - \frac{1}{2} {(𝐱_{n} - μ_{k})}^{T} Λ_{k} (𝐱_{n} - μ_{k}))] . \end{aligned}

Plugging (21.131) and (21.132) into this expression, we end up with (21.209).
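To make the substitution explicit, here is a sketch of the intermediate step. It assumes the usual variational posterior $q (μ_{k}, Λ_{k}) = 𝒩 (μ_{k} | m_{k}, {(β_{k} Λ_{k})}^{- 1}) Wi (Λ_{k} | L_{k}, v_{k})$ , that $𝔼 [z_{𝑛𝑘}] = r_{𝑛𝑘}$ , and that $𝐳$ is independent of $(μ, Λ)$ under $q$ :

\begin{aligned} 𝔼 [\log p (𝐱 | 𝐳, μ, Λ)] = & \sum_{n} \sum_{k} r_{𝑛𝑘} (- \frac{D}{2} \log 2 π + \frac{1}{2} 𝔼 [\log | Λ_{k} |] - \frac{1}{2} 𝔼 [{(𝐱_{n} - μ_{k})}^{T} Λ_{k} (𝐱_{n} - μ_{k})]), \\ 𝔼 [{(𝐱_{n} - μ_{k})}^{T} Λ_{k} (𝐱_{n} - μ_{k})] = & \frac{D}{β_{k}} + v_{k} {(𝐱_{n} - m_{k})}^{T} L_{k} (𝐱_{n} - m_{k}) . \end{aligned}

Summing over $n$ with $N_{k} = \sum_{n} r_{𝑛𝑘}$ then collects the data-dependent terms into per-component weighted statistics.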

For (21.210):

\begin{aligned} 𝔼 [\log p (𝐳 | π)] = & 𝔼_{q (𝐳) q (π)} [\log p (𝐳 | π)] \\ = & 𝔼_{q (𝐳) q (π)} [\log \prod_{n = 1}^{N} \prod_{k = 1}^{K} π_{k}^{z_{𝑛𝑘}}] \\ = & \sum_{n = 1}^{N} \sum_{k = 1}^{K} 𝔼_{q (𝐳) q (π)} [z_{𝑛𝑘} \log π_{k}] \\ = & \sum_{n = 1}^{N} \sum_{k = 1}^{K} 𝔼_{q (𝐳)} [z_{𝑛𝑘}] \cdot 𝔼_{q (π)} [\log π_{k}] \\ = & \sum_{n = 1}^{N} \sum_{k = 1}^{K} r_{𝑛𝑘} \log {\bar{π}}_{k}, \end{aligned}

where we used the fact that the expectation of a product of independent random variables is the product of their expectations. The notation follows (21.129).
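As a sanity check on the term $𝔼_{q (π)} [\log π_{k}]$ , the sketch below compares the standard Dirichlet identity $𝔼 [\log π_{k}] = ψ (α_{k}) - ψ (\sum_{j} α_{j})$ against a Monte Carlo estimate. It uses only the Python standard library; the helper names, the finite-difference digamma, and the example values of `alpha` are assumptions made for illustration, not part of the original solution.

```python
import math
import random

def digamma(x, h=1e-5):
    """Approximate psi(x) by a central difference of log-gamma (stdlib only)."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def expected_log_pi(alpha):
    """E[log pi_k] under Dir(alpha): psi(alpha_k) - psi(sum_j alpha_j)."""
    total = digamma(sum(alpha))
    return [digamma(a) - total for a in alpha]

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

def monte_carlo_log_pi(alpha, num_samples=20000, seed=0):
    """Estimate E[log pi_k] by averaging log pi_k over Dirichlet samples."""
    rng = random.Random(seed)
    sums = [0.0] * len(alpha)
    for _ in range(num_samples):
        pi = sample_dirichlet(alpha, rng)
        for k, p in enumerate(pi):
            sums[k] += math.log(p)
    return [s / num_samples for s in sums]

alpha = [2.0, 3.5, 5.0]  # hypothetical posterior Dirichlet parameters
closed_form = expected_log_pi(alpha)
estimate = monte_carlo_log_pi(alpha)
```

The two lists should agree up to Monte Carlo noise. In practice a library digamma (for example the one in SciPy) would replace the finite-difference helper.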

For (21.211):

\begin{aligned} 𝔼 [\log p (π)] = & 𝔼_{q (π)} [\log p (π)] \\ = & 𝔼_{q (π)} [\log (const \cdot \prod_{k = 1}^{K} π_{k}^{α_{0} - 1})] \\ = & \log const + (α_{0} - 1) \sum_{k = 1}^{K} \log {\bar{π}}_{k}, \end{aligned}

and plugging in (21.216) finishes the proof.

For (21.212):

\begin{aligned} 𝔼 [\log p (μ, Λ)] = & 𝔼_{q (μ, Λ)} [\log p (μ, Λ)] \\ = & 𝔼_{q (μ, Λ)} [\log \prod_{k = 1}^{K} Wi (Λ_{k} | L_{0}, v_{0}) \cdot 𝒩 (μ_{k} | m_{0}, {(β_{0} Λ_{k})}^{- 1})] \\ = & \sum_{k = 1}^{K} 𝔼_{q (μ, Λ)} [\log C + \frac{1}{2} (v_{0} - D - 1) \log | Λ_{k} | - \frac{1}{2} tr (Λ_{k} L_{0}^{- 1}) \\ & - \frac{D}{2} \log 2 π + \frac{1}{2} \log | β_{0} Λ_{k} | - \frac{1}{2} {(μ_{k} - m_{0})}^{T} (β_{0} Λ_{k}) (μ_{k} - m_{0})] . \end{aligned}

To finish, we use (21.131) to expand the expected value of the quadratic form, together with the fact that the mean of the Wishart distribution $Wi (Λ_{k} | L_{k}, v_{k})$ is $v_{k} L_{k}$ .
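The two expectations that need care can be written out as follows. This is a sketch assuming $q (μ_{k}, Λ_{k}) = 𝒩 (μ_{k} | m_{k}, {(β_{k} Λ_{k})}^{- 1}) Wi (Λ_{k} | L_{k}, v_{k})$ , so that $𝔼 [Λ_{k}] = v_{k} L_{k}$ ; the second line splits $μ_{k} - m_{0}$ into $(μ_{k} - m_{k}) + (m_{k} - m_{0})$ and takes the expectation over $μ_{k}$ first, then over $Λ_{k}$ :

\begin{aligned} 𝔼 [tr (Λ_{k} L_{0}^{- 1})] = & tr (𝔼 [Λ_{k}] L_{0}^{- 1}) = v_{k} tr (L_{0}^{- 1} L_{k}), \\ 𝔼 [{(μ_{k} - m_{0})}^{T} (β_{0} Λ_{k}) (μ_{k} - m_{0})] = & \frac{D β_{0}}{β_{k}} + β_{0} v_{k} {(m_{k} - m_{0})}^{T} L_{k} (m_{k} - m_{0}) . \end{aligned}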

For (21.213):

\begin{aligned} 𝔼 [\log q (𝐳)] = & 𝔼_{q (𝐳)} [\log q (𝐳)] \\ = & 𝔼_{q (𝐳)} [\sum_{i} \sum_{k} z_{𝑖𝑘} \log r_{𝑖𝑘}] \\ = & \sum_{i} \sum_{k} 𝔼_{q (𝐳)} [z_{𝑖𝑘}] \cdot \log r_{𝑖𝑘} \\ = & \sum_{i} \sum_{k} r_{𝑖𝑘} \log r_{𝑖𝑘} . \end{aligned}

We only have to recall (21.124).
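The final expression is the negative entropy of the responsibilities, so it is easy to evaluate directly. The sketch below is a minimal illustration: the function name and the example matrix `r` are hypothetical, and treating $0 \log 0$ as $0$ is an assumed convention (the limit of $r \log r$ as $r \to 0$).

```python
import math

def expected_log_q_z(r):
    """Compute sum_i sum_k r_ik * log(r_ik), treating 0 * log 0 as 0."""
    total = 0.0
    for row in r:
        for r_ik in row:
            if r_ik > 0.0:  # skip exact zeros to avoid log(0)
                total += r_ik * math.log(r_ik)
    return total

# Hypothetical responsibilities for N = 3 points and K = 2 components;
# each row sums to 1.
r = [[0.5, 0.5], [1.0, 0.0], [0.25, 0.75]]
value = expected_log_q_z(r)
```

The result is never positive, and it is zero exactly when every point is assigned to a single component with certainty.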

For (21.214):

\begin{aligned} 𝔼 [\log q (π)] = & 𝔼_{q (π)} [\log q (π)] \\ = & 𝔼_{q (π)} [\log C + \sum_{k = 1}^{K} (α_{k} - 1) \log π_{k}] \\ = & \log C + \sum_{k} (α_{k} - 1) \log {\bar{π}}_{k} . \end{aligned}

Finally, for (21.215):

\begin{aligned} 𝔼 [\log q (μ, Λ)] = & 𝔼_{q (μ, Λ)} [\log q (μ, Λ)] \\ = & \sum_{k} 𝔼_{q (μ, Λ)} [\log q (Λ_{k}) - \frac{D}{2} \log 2 π + \frac{1}{2} \log | β_{k} Λ_{k} | \\ & - \frac{1}{2} {(μ_{k} - m_{k})}^{T} (β_{k} Λ_{k}) (μ_{k} - m_{k})] . \end{aligned}

Using (21.132) to expand the quadratic gives $𝔼 [{(μ_{k} - m_{k})}^{T} (β_{k} Λ_{k}) (μ_{k} - m_{k})] = D$ .
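The value $D$ can also be seen from the trace trick. A short sketch, conditioning on $Λ_{k}$ and assuming $q (μ_{k} | Λ_{k}) = 𝒩 (μ_{k} | m_{k}, {(β_{k} Λ_{k})}^{- 1})$ :

\begin{aligned} 𝔼 [{(μ_{k} - m_{k})}^{T} (β_{k} Λ_{k}) (μ_{k} - m_{k})] = & 𝔼 [tr (β_{k} Λ_{k} (μ_{k} - m_{k}) {(μ_{k} - m_{k})}^{T})] \\ = & 𝔼_{q (Λ_{k})} [tr (β_{k} Λ_{k} {(β_{k} Λ_{k})}^{- 1})] \\ = & tr (I_{D}) = D . \end{aligned}

The inner expectation over $μ_{k}$ replaces the outer product by the conditional covariance ${(β_{k} Λ_{k})}^{- 1}$ , so the result does not depend on $Λ_{k}$ .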

solour_lfq

2021-03-24 13:42
