Exercise 2.15 - MLE minimizes KL divergence to the empirical distribution
Answers
Expand the KL divergence between the empirical distribution $p_{\text{emp}}(x) = \frac{1}{N}\sum_{n=1}^{N}\delta_{x_n}(x)$ and the model $p(x \mid \theta)$:

$$
\begin{aligned}
\hat{\theta} &= \arg\min_{\theta} \mathbb{KL}\left(p_{\text{emp}} \,\|\, p(\cdot \mid \theta)\right) \\
&= \arg\min_{\theta} \left[ \sum_x p_{\text{emp}}(x) \log p_{\text{emp}}(x) - \sum_x p_{\text{emp}}(x) \log p(x \mid \theta) \right] \\
&= \arg\min_{\theta} \left[ -\mathbb{H}(p_{\text{emp}}) - \frac{1}{N} \sum_{n=1}^{N} \log p(x_n \mid \theta) \right] \\
&= \arg\max_{\theta} \frac{1}{N} \sum_{n=1}^{N} \log p(x_n \mid \theta) \\
&= \hat{\theta}_{\text{MLE}}
\end{aligned}
$$
The third line substitutes the definition of the empirical distribution, which turns the expectation of $\log p(x \mid \theta)$ into the sample average $\frac{1}{N}\sum_{n} \log p(x_n \mid \theta)$; for finite $N$ this is an identity, and by the weak law of large numbers the same average converges to the expectation under the true data-generating distribution as $N \to \infty$. In the last step we drop the entropy of the empirical distribution, $\mathbb{H}(p_{\text{emp}})$, which is independent of $\theta$, and flip the sign, turning the minimization into maximization of the average log-likelihood. The other direction of optimization, $\arg\min_{\theta} \mathbb{KL}\left(p(\cdot \mid \theta) \,\|\, p_{\text{emp}}\right)$, contains an expectation w.r.t. $p(\cdot \mid \theta)$ itself and is harder to solve (for a discrete model it is infinite whenever $p(\cdot \mid \theta)$ puts mass on states not seen in the data).
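As a quick numerical sanity check (not part of the original solution; the categorical model, sample size, and parameter grid below are illustrative assumptions), the sketch confirms that the $\theta$ maximizing the average log-likelihood is the same grid point that minimizes $\mathbb{KL}(p_{\text{emp}} \,\|\, p(\cdot \mid \theta))$, since the two objectives differ only by the constant $\mathbb{H}(p_{\text{emp}})$:

```python
# Minimal sketch: for a 3-state categorical model, the grid point that
# maximizes the average log-likelihood also minimizes KL(p_emp || p_theta).
import numpy as np

rng = np.random.default_rng(0)

# Draw N samples from an assumed "true" categorical distribution.
true_p = np.array([0.5, 0.3, 0.2])
N = 1000
data = rng.choice(3, size=N, p=true_p)

# Empirical distribution: normalized counts (with N = 1000 every state
# is observed with overwhelming probability, so log(p_emp) is finite).
counts = np.bincount(data, minlength=3)
p_emp = counts / N

def log_lik(theta):
    """Average log-likelihood (1/N) * sum_n log p(x_n | theta)."""
    return np.sum(counts * np.log(theta)) / N

def kl_emp_to_model(theta):
    """KL(p_emp || p_theta) = -H(p_emp) - (1/N) * sum_n log p(x_n | theta)."""
    entropy = -np.sum(p_emp * np.log(p_emp))
    return -entropy - log_lik(theta)

# Coarse grid of candidate categorical parameters (a, b, 1 - a - b).
grid = [np.array([a, b, 1 - a - b])
        for a in np.linspace(0.05, 0.9, 18)
        for b in np.linspace(0.05, 0.9, 18)
        if a + b < 0.95]

best_mle = max(grid, key=log_lik)
best_kl = min(grid, key=kl_emp_to_model)

print("argmax log-likelihood:     ", best_mle)
print("argmin KL(p_emp||p_theta): ", best_kl)   # same grid point
print("closed-form MLE (= p_emp): ", p_emp)
```

For the categorical model the optimum is of course available in closed form ($\hat{\theta}_{\text{MLE}} = p_{\text{emp}}$), so the grid search here serves only to exhibit that both objectives select the same point.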