Exercise 11.3 - EM for mixtures of Bernoullis

Answers

For the mixture of Bernoullis model, consider $K$ base models, each of which is a Bernoulli distribution:

$$\mathrm{Ber}(x \mid \theta_k) = \theta_k^{\mathbb{I}(x=1)} (1 - \theta_k)^{\mathbb{I}(x=0)}.$$

The auxiliary function, which we are to optimize w.r.t. $\theta$, is:

$$
\begin{aligned}
Q(\theta, \theta^{\text{old}})
&= \mathbb{E}_{p(\mathbf{z} \mid \mathcal{D}, \theta^{\text{old}})}\!\left[ \sum_{n=1}^{N} \log p(x_n, \mathbf{z}_n \mid \theta) \right] \\
&= \sum_{n=1}^{N} \sum_{k=1}^{K} \mathbb{E}[z_{nk}] \big( \log \pi_k + \mathbb{I}(x_n = 1) \log \theta_k + \mathbb{I}(x_n = 0) \log (1 - \theta_k) \big) \\
&= \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk} \big( \log \pi_k + \mathbb{I}(x_n = 1) \log \theta_k + \mathbb{I}(x_n = 0) \log (1 - \theta_k) \big).
\end{aligned}
$$
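Here $r_{nk} \equiv \mathbb{E}[z_{nk}]$ is the responsibility computed in the E-step. For completeness (this step is standard EM and not part of the question), it is

$$r_{nk} = p(z_{nk} = 1 \mid x_n, \theta^{\text{old}}) = \frac{\pi_k^{\text{old}} \, \mathrm{Ber}(x_n \mid \theta_k^{\text{old}})}{\sum_{j=1}^{K} \pi_j^{\text{old}} \, \mathrm{Ber}(x_n \mid \theta_j^{\text{old}})}.$$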

Taking the derivative w.r.t. $\theta_k$,

$$\frac{\partial Q}{\partial \theta_k} = \sum_{n=1}^{N} r_{nk} \left( \mathbb{I}(x_n = 1) \frac{1}{\theta_k} - \mathbb{I}(x_n = 0) \frac{1}{1 - \theta_k} \right),$$

and setting it to zero (using $\mathbb{I}(x_n = 1) + \mathbb{I}(x_n = 0) = 1$ when solving for $\theta_k$) gives:

$$\theta_k = \frac{\sum_{n=1}^{N} r_{nk} \, \mathbb{I}(x_n = 1)}{\sum_{n=1}^{N} r_{nk}}.$$
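To make the updates concrete, here is a minimal NumPy sketch of one EM iteration for this model, assuming scalar observations $x_n \in \{0, 1\}$; the function name `em_step` and the `eps` guard against empty components are my own additions, not part of the exercise:

```python
import numpy as np

def em_step(x, pi, theta, eps=1e-12):
    """One EM iteration for a K-component mixture of (scalar) Bernoullis.

    x:     (N,) array of 0/1 observations
    pi:    (K,) mixing weights
    theta: (K,) Bernoulli parameters
    """
    # E-step: responsibilities r[n, k] proportional to pi_k * Ber(x_n | theta_k)
    lik = np.where(x[:, None] == 1, theta[None, :], 1.0 - theta[None, :])  # (N, K)
    r = pi[None, :] * lik
    r /= r.sum(axis=1, keepdims=True)

    # M-step: the weighted-MLE updates derived above
    Nk = r.sum(axis=0)                                  # effective count per component
    pi_new = Nk / len(x)                                # standard mixing-weight update
    theta_new = (r * x[:, None]).sum(axis=0) / (Nk + eps)
    return pi_new, theta_new
```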

This is exactly (11.116), modulo α-conversion (a renaming of symbols).

If a $\mathrm{Beta}(\alpha_k, \beta_k)$ prior is introduced for each base model, then we in effect add $\alpha_k - 1$ positive and $\beta_k - 1$ negative pseudo-samples to the computation. This is tantamount to setting $r_{nk} = 1$ for $n = N + 1, \dots, N + \alpha_k + \beta_k - 2$, so:

$$\theta_k = \frac{\sum_{n=1}^{N} r_{nk} \, \mathbb{I}(x_n = 1) + \alpha_k - 1}{\sum_{n=1}^{N} r_{nk} + \alpha_k + \beta_k - 2}.$$
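The MAP update is a small modification of the M-step in the sketch above (again hypothetical helper code, under the same scalar-data assumptions); note that $\alpha_k = \beta_k = 1$ recovers the MLE update:

```python
import numpy as np

def m_step_map(r, x, alpha, beta):
    """MAP M-step under Beta(alpha_k, beta_k) priors.

    r:           (N, K) responsibilities from the E-step
    x:           (N,) array of 0/1 observations
    alpha, beta: (K,) prior pseudo-count parameters
    """
    # With alpha_k = beta_k = 1 this reduces to the MLE update above.
    num = (r * x[:, None]).sum(axis=0) + alpha - 1.0
    den = r.sum(axis=0) + alpha + beta - 2.0
    return num / den
```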

At this point one might wonder about the necessity of introducing a mixture of Bernoullis at all. Unlike the mixture of Gaussians, the Bernoulli mixture seems less compelling. Let $\theta$ denote the weighted average of the base parameters:

$$\theta = \sum_k \pi_k \theta_k,$$

then the marginal of the mixture is exactly $\mathrm{Ber}(\theta)$, so its variance remains $\theta - \theta^2$ (since $x^2 = x$ for binary $x$, $\mathrm{Var}[x] = \mathbb{E}[x] - \mathbb{E}[x]^2$); a numerical check follows below. Hence there is no need to use a mixture of Bernoullis (as far as prediction goes) unless we must explicitly model a scenario with genuine mixture structure. For example, suppose we are told that a binary string is generated by a set of biased coins, each with its own dynamics, and we are asked to infer which coin generated a specific toss. But even this scenario can be problematic: consider one coin that always yields heads and another that always yields tails.
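A quick simulation illustrates that the mixture's marginal is indistinguishable from a single Bernoulli with the averaged parameter (the weights, parameters, and seed below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
pi = np.array([0.3, 0.7])
theta_k = np.array([0.9, 0.2])
theta = pi @ theta_k                      # weighted average, as above: 0.41

# Sample from the mixture: pick a component, then flip its coin.
z = rng.choice(2, size=100_000, p=pi)
x = rng.random(100_000) < theta_k[z]

print(x.mean(), theta)                    # both ~0.41
print(x.var(), theta - theta**2)          # both ~0.2419
```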
