Exercise 5.8 - MLE and model selection for a 2d discrete distribution
Answers
For question (a), the joint distribution $p(x, y \mid \boldsymbol{\theta}) = p(x \mid \theta_1)\,p(y \mid x, \theta_2)$ is given by:

$$\begin{array}{c|cc}
p(x, y) & y = 0 & y = 1 \\ \hline
x = 0 & (1 - \theta_1)\,\theta_2 & (1 - \theta_1)(1 - \theta_2) \\
x = 1 & \theta_1(1 - \theta_2) & \theta_1\theta_2
\end{array}$$

This can be compactly written as:

$$p(x, y \mid \boldsymbol{\theta}) = \theta_1^{x}(1 - \theta_1)^{1 - x}\,\theta_2^{x \odot y}(1 - \theta_2)^{1 - (x \odot y)}$$

where $\odot$ is the Exclusive NOR operator, i.e. $x \odot y = 1$ if and only if $x = y$.
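As a quick sanity check, here is a minimal Python sketch that tabulates the joint and confirms it normalizes and matches the compact XNOR form (the values of `theta1` and `theta2` are arbitrary illustrative choices, not quantities from the exercise):

```python
# Minimal sketch: tabulate p(x, y | theta) and check it against the compact form.
# theta1, theta2 are arbitrary illustrative values, not fitted quantities.
theta1, theta2 = 0.3, 0.8

# Entries of the 2x2 table, written out directly
table = {
    (0, 0): (1 - theta1) * theta2,
    (0, 1): (1 - theta1) * (1 - theta2),
    (1, 0): theta1 * (1 - theta2),
    (1, 1): theta1 * theta2,
}

def compact(x, y):
    xnor = 1 - (x ^ y)  # 1 iff x == y
    return theta1**x * (1 - theta1)**(1 - x) * theta2**xnor * (1 - theta2)**(x ^ y)

assert all(abs(table[xy] - compact(*xy)) < 1e-12 for xy in table)
print(sum(table.values()))  # 1.0: the joint normalizes
```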
For question (b), given the dataset $x = (1, 1, 0, 1, 1, 0, 0)$, $y = (1, 0, 0, 0, 1, 0, 1)$, the MLE for $\theta_1$ is $\hat{\theta}_1 = \frac{N(x = 1)}{N} = \frac{4}{7}$, while that for $\theta_2$ is $\hat{\theta}_2 = \frac{N(x = y)}{N} = \frac{4}{7}$. Since the likelihood factorizes into one term depending only on $\theta_1$ and another depending only on $\theta_2$, both MLEs can be obtained by simple counting. The evidence is given by:

$$p(\mathcal{D} \mid \hat{\boldsymbol{\theta}}, M_2) = \hat{\theta}_1^{4}(1 - \hat{\theta}_1)^{3}\,\hat{\theta}_2^{4}(1 - \hat{\theta}_2)^{3} = \left(\tfrac{4}{7}\right)^{8}\left(\tfrac{3}{7}\right)^{6} \approx 7.04 \times 10^{-5}$$
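A minimal sketch that reproduces these counts and the $M_2$ evidence (same $x$, $y$ as in the part (d) script; the variable names are illustrative):

```python
x = [1, 1, 0, 1, 1, 0, 0]
y = [1, 0, 0, 0, 1, 0, 1]
n = len(x)

theta1 = sum(x) / n                                  # N(x=1)/N = 4/7
theta2 = sum(xi == yi for xi, yi in zip(x, y)) / n   # N(x=y)/N = 4/7

# Evidence: product of per-pair likelihoods at the MLE
evidence_m2 = 1.0
for xi, yi in zip(x, y):
    xnor = 1 - (xi ^ yi)
    evidence_m2 *= (theta1**xi * (1 - theta1)**(1 - xi)
                    * theta2**xnor * (1 - theta2)**(xi ^ yi))
print(theta1, theta2, evidence_m2)   # ~0.5714, ~0.5714, ~7.04e-05
```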
For question (c), the MLE for $\boldsymbol{\theta} = (\theta_{00}, \theta_{01}, \theta_{10}, \theta_{11})$ is computed by normalizing the count vector $(N_{00}, N_{01}, N_{10}, N_{11}) = (2, 1, 2, 2)$, so:

$$\hat{\theta}_{00} = \tfrac{2}{7}, \quad \hat{\theta}_{01} = \tfrac{1}{7}, \quad \hat{\theta}_{10} = \tfrac{2}{7}, \quad \hat{\theta}_{11} = \tfrac{2}{7}$$

The evidence is:

$$p(\mathcal{D} \mid \hat{\boldsymbol{\theta}}, M_4) = \prod_{x, y} \hat{\theta}_{xy}^{N_{xy}} = \left(\tfrac{2}{7}\right)^{2}\left(\tfrac{1}{7}\right)^{1}\left(\tfrac{2}{7}\right)^{2}\left(\tfrac{2}{7}\right)^{2} = \frac{64}{7^{7}} \approx 7.77 \times 10^{-5}$$
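The corresponding sketch for $M_4$, normalizing the joint counts and evaluating the evidence:

```python
from collections import Counter

x = [1, 1, 0, 1, 1, 0, 0]
y = [1, 0, 0, 0, 1, 0, 1]
n = len(x)

counts = Counter(zip(x, y))                      # N_00=2, N_01=1, N_10=2, N_11=2
theta = {xy: c / n for xy, c in counts.items()}  # MLE: normalized count vector

# Evidence: product over the data of theta_{x_i, y_i}
evidence_m4 = 1.0
for xy in zip(x, y):
    evidence_m4 *= theta[xy]
print(theta, evidence_m4)   # evidence ~7.77e-05
```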
For question (d):
```python
import math

x = [1, 1, 0, 1, 1, 0, 0]
y = [1, 0, 0, 0, 1, 0, 1]

l2 = 0.0            # LOO-CV log-likelihood for M2
l4 = 0.0            # LOO-CV log-likelihood for M4
e = 1 / 10**5       # smoothing constant, avoids log(0)

for shadow in range(len(x)):   # index of the held-out pair
    temp1 = 0                  # count of x == 1 in the training fold
    temp2 = 0                  # count of x == y in the training fold
    temp00 = temp01 = temp10 = temp11 = 0  # joint counts for M4
    for i in range(len(x)):
        if i == shadow:
            continue
        if x[i] == 1:
            temp1 += 1
        if x[i] == y[i]:
            temp2 += 1
        if x[i] == 0 and y[i] == 0:
            temp00 += 1
        if x[i] == 0 and y[i] == 1:
            temp01 += 1
        if x[i] == 1 and y[i] == 0:
            temp10 += 1
        if x[i] == 1 and y[i] == 1:
            temp11 += 1

    # MLEs on the training fold
    theta_1 = temp1 / (len(x) - 1)
    theta_2 = temp2 / (len(x) - 1)
    s = temp00 + temp01 + temp10 + temp11
    theta_00, theta_01 = temp00 / s, temp01 / s
    theta_10, theta_11 = temp10 / s, temp11 / s

    # Held-out likelihood under M2; x ^ y is XOR, so 1 - (x ^ y) is XNOR
    p2 = (theta_1 ** x[shadow] * (1 - theta_1) ** (1 - x[shadow])
          * theta_2 ** (1 - (x[shadow] ^ y[shadow]))
          * (1 - theta_2) ** (x[shadow] ^ y[shadow]))
    # Held-out likelihood under M4 (exactly one indicator is 1)
    p4 = (theta_00 ** (x[shadow] == 0 and y[shadow] == 0)
          * theta_01 ** (x[shadow] == 0 and y[shadow] == 1)
          * theta_10 ** (x[shadow] == 1 and y[shadow] == 0)
          * theta_11 ** (x[shadow] == 1 and y[shadow] == 1))

    l2 += math.log(p2 + e)
    l4 += math.log(p4 + e)

print(l2)
print(l4)
```
The result is approximately:

-12.1364
-22.2631
Hence CV will pick $M_2$. The reason is that $M_4$ assigns zero probability to the pair $(x, y) = (0, 1)$ during cross-validation: that pair occurs only once in the data, so when it is held out its training-fold count is zero and the held-out log-likelihood collapses to $\log \epsilon$, which drags the total sharply down.
For question (e), using $\mathrm{BIC} = \log p(\mathcal{D} \mid \hat{\boldsymbol{\theta}}, M) - \frac{\mathrm{dof}(M)}{2} \log N$ with $N = 7$, the BICs for $M_2$ (2 free parameters) and $M_4$ (3 free parameters, due to the sum-to-one constraint) are respectively:

$$\mathrm{BIC}(M_2) = \log\!\left[\left(\tfrac{4}{7}\right)^{8}\left(\tfrac{3}{7}\right)^{6}\right] - \tfrac{2}{2}\log 7 \approx -9.561 - 1.946 = -11.507$$

$$\mathrm{BIC}(M_4) = \log\frac{64}{7^{7}} - \tfrac{3}{2}\log 7 \approx -9.462 - 2.919 = -12.381$$

Hence the BIC prefers $M_2$ as well.
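A minimal sketch of this computation, using the penalized form $\log p(\mathcal{D} \mid \hat{\boldsymbol{\theta}}, M) - \tfrac{\mathrm{dof}}{2}\log N$ written out above:

```python
import math

n = 7
loglik_m2 = 8 * math.log(4 / 7) + 6 * math.log(3 / 7)  # log p(D | theta_hat, M2)
loglik_m4 = math.log(64 / 7**7)                        # log p(D | theta_hat, M4)

bic_m2 = loglik_m2 - (2 / 2) * math.log(n)  # M2: 2 free parameters
bic_m4 = loglik_m4 - (3 / 2) * math.log(n)  # M4: 3 free parameters (sum-to-one)
print(bic_m2, bic_m4)                       # ~-11.51 vs ~-12.38: BIC picks M2
```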