Exercise 3.13 - Posterior predictive distribution for a batch of data with the Dirichlet-multinomial model

Answers

The likelihood for the Dirichlet-multinomial model is:

p (𝒟 | 𝜃) = \prod_{k = 1}^{K} 𝜃_{k}^{N_{k}^{old}},

following the symbols defined in the textbook. The conjugate prior is the Dirichlet distribution:

p (𝜃 | α) = \frac{1}{B (α)} \cdot \prod_{k = 1}^{K} 𝜃_{k}^{α_{k} - 1},

where $𝜃$ lies on the $K$-dimensional probability simplex. Equation (3.37) in the textbook writes $𝐱$ where $𝜃$ is meant.

The posterior distribution is another Dirichlet distribution with update:

α_{k} \leftarrow α_{k} + N_{k}^{old} .

To predict a new batch of data $\tilde{𝒟}$, we begin with a single sample $x \in \tilde{𝒟}$:

\begin{aligned} p (x = k | 𝒟, α) & = \int_{𝜃} p (x = k | 𝜃) \cdot p (𝜃 | 𝒟, α) d 𝜃 \\ & = \int_{𝜃} 𝜃_{k} \cdot p (𝜃 | 𝒟, α) d 𝜃 \\ & = 𝔼_{Dir} [𝜃_{k}], \end{aligned}

where the expectation is taken with respect to the posterior Dirichlet distribution, and therefore equals:

\frac{α_{k} + N_{k}^{old}}{\sum_{t = 1}^{K} (α_{t} + N_{t}^{old})} .
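
The single-sample predictive can be checked numerically. The following is a minimal Python sketch, not part of the textbook: the function names `posterior_alpha` and `predictive_single`, the encoding of categories as indices `0..K-1`, and the example data are all assumptions made for illustration.

```python
from collections import Counter


def posterior_alpha(alpha, data):
    """Return the posterior Dirichlet parameters alpha_k + N_k^old.

    alpha: list of K prior pseudo-counts.
    data:  iterable of observed category indices in range(K) (assumed encoding).
    """
    counts = Counter(data)
    return [a + counts.get(k, 0) for k, a in enumerate(alpha)]


def predictive_single(alpha, data):
    """Return p(x = k | D, alpha) for every k, i.e. the posterior mean of theta."""
    post = posterior_alpha(alpha, data)
    total = sum(post)
    return [a / total for a in post]


# Hypothetical example: K = 3, uniform prior, five observations.
alpha = [1.0, 1.0, 1.0]
data = [0, 0, 1, 2, 0]
probs = predictive_single(alpha, data)
```

With these made-up inputs the old counts are (3, 1, 1), so the posterior parameters are (4, 2, 2) and the predictive probabilities are their normalized values.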

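Finally, the samples in $\tilde{𝒟}$ are independent only conditionally on $𝜃$, so the batch predictive is obtained by integrating the joint likelihood of $\tilde{𝒟}$ against the posterior, not by multiplying single-sample predictives:

\begin{aligned} p (\tilde{𝒟} | 𝒟, α) & = \int_{𝜃} \prod_{k = 1}^{K} 𝜃_{k}^{N_{k}^{new}} \cdot p (𝜃 | 𝒟, α) d 𝜃 \\ & = \frac{1}{B (α + N^{old})} \int_{𝜃} \prod_{k = 1}^{K} 𝜃_{k}^{α_{k} + N_{k}^{old} + N_{k}^{new} - 1} d 𝜃 \\ & = \frac{B (α + N^{old} + N^{new})}{B (α + N^{old})} , \end{aligned}

where $N^{old}$ and $N^{new}$ are the count vectors of $𝒟$ and $\tilde{𝒟}$, and $B (α) = \prod_{k = 1}^{K} Γ (α_{k}) / Γ (\sum_{k = 1}^{K} α_{k})$. For a batch containing a single sample, this reduces to the single-sample predictive above.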
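
As a numerical sketch, the batch predictive obtained by integrating $𝜃$ out against the posterior Dirichlet is the ratio $B (α + N^{old} + N^{new}) / B (α + N^{old})$ of multivariate beta functions, which can be evaluated with log-gamma functions. The helper names `log_beta` and `batch_predictive` and the example counts below are assumptions for illustration; the value returned is the probability of one particular ordered sequence of new samples with the given counts.

```python
from math import exp, lgamma


def log_beta(alpha):
    """Log of the multivariate beta function B(alpha)."""
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))


def batch_predictive(alpha, n_old, n_new):
    """p(new batch | old data, alpha) for one ordered sequence of new samples.

    alpha, n_old, n_new: length-K lists holding the prior pseudo-counts,
    the old counts N_k^old and the new counts N_k^new.
    """
    post = [a + n for a, n in zip(alpha, n_old)]      # alpha + N^old
    post_new = [p + n for p, n in zip(post, n_new)]   # alpha + N^old + N^new
    return exp(log_beta(post_new) - log_beta(post))


# Hypothetical example: K = 3, uniform prior, old counts (3, 1, 1).
alpha = [1.0, 1.0, 1.0]
n_old = [3, 1, 1]

# One new sample in category 0: mathematically 4/8, the posterior mean.
p_single = batch_predictive(alpha, n_old, [1, 0, 0])

# Two new samples, both in category 0: mathematically (4/8) * (5/9).
p_two = batch_predictive(alpha, n_old, [2, 0, 0])
```

Working in log space avoids overflow in the gamma functions when the counts are large. Summing `batch_predictive` over all $K^{m}$ ordered sequences of a fixed length $m$ should give 1, which is a convenient sanity check.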

solour_lfq

2021-03-24 13:42