Exercise 12.11 - PPCA vs FA
Answers
import numpy as np
from numpy.linalg import eig
import matplotlib.pyplot as plt

np.random.seed(520)

# Latent samples: three independent standard normals.
z1 = np.random.normal(0, 1, size=200)
z2 = np.random.normal(0, 1, size=200)
z3 = np.random.normal(0, 1, size=200)

# Observations: x1 and x2 are almost perfectly correlated,
# while x3 has a much larger variance.
x1 = z1
x2 = z1 + 0.001 * z2
x3 = 10 * z3

# 3-D scatter of the latent samples (cyan) and the observations (red).
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.view_init(elev=20., azim=30)
ax.scatter(z1, z2, z3, color="cyan")
ax.scatter(x1, x2, x3, color="red")
plt.show()

# Center the data and form the empirical covariance matrices.
Z = np.vstack((z1, z2, z3))
Z -= Z.mean(axis=1, keepdims=True)
X = np.vstack((x1, x2, x3))
X -= X.mean(axis=1, keepdims=True)
SZ = Z @ Z.T / 200
SX = X @ X.T / 200

vals, vecs = eig(SX)
print(vals)
The resulting scatter plots are shown in the following figure:
[Figure: 3-D scatter of the latent samples z (cyan) and the observations x (red)]
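The picture can be made precise by writing down the population covariance of x, which follows directly from the construction x1 = z1, x2 = z1 + 0.001*z2, x3 = 10*z3 with independent standard-normal z's. A minimal sketch (the exact matrix, not the sample estimate SX computed above):

import numpy as np
from numpy.linalg import eig

# Population covariance of x = (z1, z1 + 0.001*z2, 10*z3):
# Var(x1) = 1, Cov(x1, x2) = 1, Var(x2) = 1 + 1e-6, Var(x3) = 100.
C = np.array([[1.0, 1.0,        0.0],
              [1.0, 1.0 + 1e-6, 0.0],
              [0.0, 0.0,        100.0]])
vals, vecs = eig(C)
print(vals)  # roughly [2.0, 5e-7, 100.0]

So there is one dominant direction (dimension 3, eigenvalue 100), one strongly correlated direction in the (x1, x2) plane (eigenvalue about 2), and one nearly degenerate direction (eigenvalue about 5e-7).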
Here the red points are the observations x and the cyan points are the latent samples z. PCA would select dimension 3 as the principal component, since it has by far the largest variance (eigenvalue ≈ 100, versus ≈ 2 for the correlated direction). PPCA would select the same direction if σ² is small enough, because its maximum-likelihood weight matrix spans the top principal subspace. FA, in contrast, fits a separate noise variance per dimension: ψ₃ would be estimated as approximately 100, absorbing dimension 3's variance as noise, hence the reduced (noise-corrected) variance is larger for the first and the second dimensions, and FA selects the strongly correlated direction spanning x₁ and x₂.
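As a sanity check, both models can be fitted directly. A sketch using scikit-learn (assumed available here; PCA stands in for PPCA in the small-σ² limit, since the PPCA maximum-likelihood solution spans the same principal subspace):

import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis

np.random.seed(520)
z = np.random.normal(size=(200, 3))
X = np.column_stack((z[:, 0], z[:, 0] + 0.001 * z[:, 1], 10 * z[:, 2]))

# PCA / PPCA: the single component should align with dimension 3.
pca = PCA(n_components=1).fit(X)
print(pca.components_)

# FA: the loadings should concentrate on dimensions 1 and 2, with the
# per-dimension noise variance psi_3 absorbing dimension 3 (about 100).
fa = FactorAnalysis(n_components=1).fit(X)
print(fa.components_)
print(fa.noise_variance_)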