Exercise 12.11 - PPCA vs FA

Answers

import numpy as np
from numpy.linalg import eig
import matplotlib.pyplot as plt

np.random.seed(520)
n = 200

# Latent variables: three independent standard normals.
z1 = np.random.normal(0, 1, size=n)
z2 = np.random.normal(0, 1, size=n)
z3 = np.random.normal(0, 1, size=n)

# Observed variables as defined in the exercise.
x1 = z1
x2 = z1 + 0.001 * z2
x3 = 10 * z3

# 3D scatter of Z (cyan) and X (red).
# fig.add_subplot(projection="3d") replaces Axes3D(fig), which no longer
# attaches the axes to the figure in recent matplotlib versions.
fig = plt.figure()
axes3d = fig.add_subplot(projection="3d")
axes3d.view_init(elev=20., azim=30)
axes3d.scatter(z1, z2, z3, color="cyan")
axes3d.scatter(x1, x2, x3, color="red")

# Stack into 3 x n matrices and center each row (variable).
Z = np.vstack((z1, z2, z3))
Z = Z - Z.mean(axis=1, keepdims=True)
X = np.vstack((x1, x2, x3))
X = X - X.mean(axis=1, keepdims=True)

# Sample covariance matrices (dividing by n).
SZ = Z @ Z.T / n
SX = X @ X.T / n

# Eigenvalues and eigenvectors of the covariance of X.
vals, vecs = eig(SX)
print(vals)
plt.show()
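
The manual centering and covariance computation above can be cross-checked against NumPy's built-in estimator. The following is a minimal sketch, not part of the original answer: it regenerates data of the same form with `np.random.default_rng` (an assumption; this produces a different random stream from `np.random.seed(520)`) and compares the result with `np.cov(..., bias=True)`, which also divides by n.

```python
import numpy as np

# Assumption: a fresh generator, so the numbers differ from the listing above.
rng = np.random.default_rng(520)
n = 200

# Rows are z1, z2, z3; build x1, x2, x3 as in the exercise.
z = rng.normal(0, 1, size=(3, n))
X = np.vstack((z[0], z[0] + 0.001 * z[1], 10 * z[2]))

# Manual estimate: center each row, then divide the outer product by n.
Xc = X - X.mean(axis=1, keepdims=True)
S_manual = Xc @ Xc.T / n

# np.cov treats rows as variables by default; bias=True divides by n.
S_numpy = np.cov(X, bias=True)

print(np.allclose(S_manual, S_numpy))
```

If the two matrices agree, the hand-written loops and the vectorized form are computing the same estimate.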

The scatter plots are shown in the following figure:


Figure 1: Exercise 12.11.

Here the red points are 𝐗 and the cyan points are 𝐙. PCA would select dimension 3 as the principal component, since it has the largest variance (x3 = 10·z3, so its variance is about 100). PPCA would select the same direction if σ² is small enough. Otherwise, σ² would be estimated as approximately 100/3, hence the reduced variance is larger for the first and second dimensions.
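
As a rough numerical companion to this argument, the sketch below applies the standard closed-form PPCA maximum-likelihood solution (Tipping and Bishop), in which σ² is the average of the discarded eigenvalues, to the population covariance of (x1, x2, x3). The helper name `ppca_ml`, the choice L = 1, and the use of the population rather than the sample covariance are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np

def ppca_ml(S, L):
    """Closed-form PPCA ML estimates for covariance S with L latent dims.

    Hypothetical helper; assumes S is symmetric and L < S.shape[0].
    """
    vals, vecs = np.linalg.eigh(S)            # eigh returns ascending order
    order = np.argsort(vals)[::-1]            # re-sort in descending order
    vals, vecs = vals[order], vecs[:, order]
    sigma2 = vals[L:].mean()                  # mean of discarded eigenvalues
    # Scale the top-L eigenvectors by sqrt(eigenvalue - sigma2).
    W = vecs[:, :L] * np.sqrt(vals[:L] - sigma2)
    return W, sigma2, vals

# Population covariance implied by x1 = z1, x2 = z1 + 0.001*z2, x3 = 10*z3.
S_pop = np.array([[1.0, 1.0,        0.0],
                  [1.0, 1.0 + 1e-6, 0.0],
                  [0.0, 0.0,        100.0]])

W, sigma2, vals = ppca_ml(S_pop, L=1)
print("eigenvalues:", vals)
print("mean of all eigenvalues:", vals.mean())
print("sigma^2 for L=1:", sigma2)
print("W:", W.ravel())
```

With this covariance the eigenvalues are about 100, 2 and 5e-7, so their overall mean is close to 100/3, while the L = 1 estimate of σ² is close to 1 and the single column of W points along the third axis (up to sign).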

2021-03-24 13:42