Exercise 12.3 - Heuristic for assessing applicability of PCA
Answers
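We derive this heuristic from an information-theoretic perspective. Recall that the differential entropy of an MVN is (with $\lambda_1 \ge \dots \ge \lambda_d > 0$ the eigenvalues of the covariance $\Sigma$):

$$h = \frac{1}{2}\log\det(2\pi e\,\Sigma) = \frac{d}{2}\log(2\pi e) + \frac{1}{2}\sum_{i=1}^{d}\log\lambda_i$$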
After PCA, the covariance of this MVN model is obtained by replacing the smallest variances with , hence the difference in entropy is:
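As a hedged sketch of what this difference looks like: write $\lambda_1 \ge \dots \ge \lambda_d > 0$ for the eigenvalues of the original covariance, and assume that PCA keeps the $k$ largest and sets each of the remaining $d-k$ to a common positive value $c$ (the symbols $k$ and $c$ are assumptions introduced here for illustration, not taken from the source). The entropy of an MVN depends on the covariance only through $\frac{1}{2}\sum_i \log\lambda_i$ plus a constant, so the retained terms cancel:

$$\Delta h = h_{\text{before}} - h_{\text{after}} = \frac{1}{2}\sum_{i=k+1}^{d}\log\lambda_i - \frac{d-k}{2}\log c = \frac{1}{2}\log\frac{\prod_{i=k+1}^{d}\lambda_i}{c^{\,d-k}}$$

Under this assumption the difference depends on the spectrum only through the product of the discarded eigenvalues.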
For two eigenvalue series with the same mean $\bar{\lambda}$, it is plausible to expect that the product of the largest values is larger in the series with the larger variance. The information loss is then smaller, which makes PCA better in terms of information compression.
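The comparison above can be checked numerically on a toy pair of spectra. The following is a minimal sketch, not a proof: the two eigenvalue lists, the number of retained components `k`, and the helper names are all hypothetical choices made for illustration.

```python
import math


def gaussian_entropy(eigvals):
    """Differential entropy (in nats) of an MVN whose covariance has these eigenvalues."""
    d = len(eigvals)
    return 0.5 * d * math.log(2 * math.pi * math.e) + 0.5 * sum(math.log(v) for v in eigvals)


def top_k_log_product(eigvals, k):
    """Log of the product of the k largest eigenvalues."""
    top = sorted(eigvals, reverse=True)[:k]
    return sum(math.log(v) for v in top)


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Two hypothetical spectra with the same mean (1.0) but different spread.
flat = [1.0, 1.0, 1.0, 1.0]
spread = [2.5, 1.0, 0.3, 0.2]
k = 2  # assumed number of retained components

for name, lam in [("flat", flat), ("spread", spread)]:
    print(name,
          "variance:", variance(lam),
          "top-k log-product:", top_k_log_product(lam, k),
          "entropy:", gaussian_entropy(lam))
```

For this pair and `k = 2`, the spectrum with the larger variance also has the larger product of its top eigenvalues. The ordering is only a plausible tendency: it need not hold for every `k` or every pair of spectra with equal mean.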