Solution manual for Kevin P. Murphy, Machine Learning: a Probabilistic Perspective
Exercise 7.7 - Sufficient statistics for online linear regression
Answers
For question (a), according to (7.99), $w_1$ can be estimated from $C_{xy}$ and $C_{xx}$ alone.
For question (b), according to (7.100), $w_0$ can be estimated from $\bar{x}$, $\bar{y}$ and $w_1$. Hence $\bar{x}$, $\bar{y}$, $C_{xy}$ and $C_{xx}$ are needed.
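As a small illustration of (a) and (b), the sketch below recovers both weights from the four statistics only. The function names and the toy data are assumptions made for this example and do not come from the book.

```python
def batch_statistics(xs, ys):
    """Compute the means and the 1/n-normalised second moments from stored data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    c_xx = sum((x - x_bar) ** 2 for x in xs) / n
    c_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    return x_bar, y_bar, c_xx, c_xy


def fit_from_statistics(x_bar, y_bar, c_xx, c_xy):
    """Return (w0, w1) using only the four sufficient statistics."""
    w1 = c_xy / c_xx          # slope, as in (7.99)
    w0 = y_bar - w1 * x_bar   # intercept, as in (7.100)
    return w0, w1


# Toy data lying exactly on y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w0, w1 = fit_from_statistics(*batch_statistics(xs, ys))
```

Once the four statistics are known, the raw data are no longer needed to evaluate the weights.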
For question (c), the solution for $\bar{x}$ has been given by (7.103)-(7.104). For $\bar{y}$, the same argument applies with $x$ renamed to $y$.
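The running-mean update in (7.104) can be checked numerically with a minimal sketch. The helper name `update_mean` is hypothetical, and the values are the first few entries of the data used in the answer to (e) and (f).

```python
def update_mean(mean_n, n, new_value):
    """Mean of n + 1 values, given the mean of the first n values and the new value."""
    return mean_n + (new_value - mean_n) / (n + 1)


values = [94, 96, 94, 95, 104]

# Start from the first value, then absorb the others one at a time.
mean = float(values[0])
for n, v in enumerate(values[1:], start=1):
    mean = update_mean(mean, n, v)

# Reference value computed from the stored data.
batch_mean = sum(values) / len(values)
```

The online value should agree with the batch mean up to floating-point rounding, without the old data ever being revisited.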
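For question (d), we need to prove:

$$(n+1)\,C_{xy}^{(n+1)} = x_{n+1}y_{n+1} + n\,C_{xy}^{(n)} + n\,\bar{x}^{(n)}\bar{y}^{(n)} - (n+1)\,\bar{x}^{(n+1)}\bar{y}^{(n+1)},$$

where $C_{xy}^{(n)} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x}^{(n)})(y_i-\bar{y}^{(n)})$.
Expanding $C_{xy}$ on both sides, the l.h.s. becomes:

$$\sum_{i=1}^{n+1}(x_i-\bar{x}^{(n+1)})(y_i-\bar{y}^{(n+1)}) = \sum_{i=1}^{n+1}x_iy_i - (n+1)\,\bar{x}^{(n+1)}\bar{y}^{(n+1)},$$

and the r.h.s. becomes:

$$x_{n+1}y_{n+1} + \Big(\sum_{i=1}^{n}x_iy_i - n\,\bar{x}^{(n)}\bar{y}^{(n)}\Big) + n\,\bar{x}^{(n)}\bar{y}^{(n)} - (n+1)\,\bar{x}^{(n+1)}\bar{y}^{(n+1)} = \sum_{i=1}^{n+1}x_iy_i - (n+1)\,\bar{x}^{(n+1)}\bar{y}^{(n+1)}.$$

The two sides agree. Finally, the averages at the $(n+1)$-th iteration on both sides are given by (7.104), so dividing by $n+1$ yields (7.105), which depends only on the stored statistics and the new data point.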
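The identity in (d) can also be verified numerically. The following is a sketch under the assumption that $C_{xy}$ uses the $1/n$ normalisation; the function names and the toy data are invented for this check.

```python
def update_cxy(c_xy_n, x_bar_n, y_bar_n, n, x_new, y_new):
    """Apply the (7.104) mean updates and the (7.105) update for C_xy."""
    x_bar_next = x_bar_n + (x_new - x_bar_n) / (n + 1)
    y_bar_next = y_bar_n + (y_new - y_bar_n) / (n + 1)
    c_xy_next = (x_new * y_new + n * c_xy_n + n * x_bar_n * y_bar_n
                 - (n + 1) * x_bar_next * y_bar_next) / (n + 1)
    return c_xy_next, x_bar_next, y_bar_next


def batch_cxy(xs, ys):
    """Reference value of C_xy computed from all stored data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n


# Toy data (made up for illustration).
xs = [1.0, 2.0, 4.0, 7.0]
ys = [1.0, 3.0, 2.0, 5.0]

# With a single point the covariance is zero and the means are the point itself.
c_xy, x_bar, y_bar = 0.0, xs[0], ys[0]
for n in range(1, len(xs)):
    c_xy, x_bar, y_bar = update_cxy(c_xy, x_bar, y_bar, n, xs[n], ys[n])

c_xy_batch = batch_cxy(xs, ys)
```

The recursive and batch values of $C_{xy}$ should coincide up to rounding. The same update with $y$ replaced by $x$ gives $C_{xx}$.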
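For questions (e) and (f):

    import matplotlib.pyplot as plt

    # Data borrowed from Exercise 7.8
    x = [94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131]
    y = [0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23]

    # Initialise the sufficient statistics from the first two points
    bx = (x[0] + x[1]) / 2
    by = (y[0] + y[1]) / 2
    Cxx = ((x[0] - bx) ** 2 + (x[1] - bx) ** 2) / 2
    Cxy = ((y[0] - by) * (x[0] - bx) + (y[1] - by) * (x[1] - bx)) / 2

    w1 = []
    w0 = []
    for n in range(2, len(x)):
        # n points have been seen so far; (x[n], y[n]) is the (n+1)-th point
        bx_ = bx + (x[n] - bx) / (n + 1)    # (7.104)
        by_ = by + (y[n] - by) / (n + 1)
        Cxx = (x[n] * x[n] + n * Cxx + n * bx * bx - (n + 1) * bx_ * bx_) / (n + 1)
        Cxy = (x[n] * y[n] + n * Cxy + n * bx * by - (n + 1) * bx_ * by_) / (n + 1)    # (7.105)
        bx = bx_
        by = by_
        w1.append(Cxy / Cxx)
        w0.append(by - w1[-1] * bx)

    # Plot the coefficient estimates against the number of points seen
    steps = range(3, len(x) + 1)
    plt.plot(steps, w0, label="w0")
    plt.plot(steps, w1, label="w1")
    plt.xlabel("number of data points")
    plt.legend()
    plt.show()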
Here the data are borrowed from Exercise 7.8 for illustration. The data in $x$ and $y$ are fed in the listed sequential order, as for Figure 7.14. If $x$ and $y$ are instead shuffled jointly, so that each pair stays together, the weight estimates are expected to converge faster.
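A joint shuffle can be sketched as follows. The function `online_fit` and the seed `0` are assumptions made for this example. The sketch does not test the convergence-speed claim. It only shows how to shuffle the pairs together and checks that the final estimate is unaffected by the order, since the sufficient statistics of the full data set do not depend on it.

```python
import random


def online_fit(xs, ys):
    """Return the (w0, w1) estimates after each point, from the third point onward."""
    x_bar = (xs[0] + xs[1]) / 2
    y_bar = (ys[0] + ys[1]) / 2
    c_xx = ((xs[0] - x_bar) ** 2 + (xs[1] - x_bar) ** 2) / 2
    c_xy = ((xs[0] - x_bar) * (ys[0] - y_bar) + (xs[1] - x_bar) * (ys[1] - y_bar)) / 2
    history = []
    for n in range(2, len(xs)):
        x_new, y_new = xs[n], ys[n]
        x_next = x_bar + (x_new - x_bar) / (n + 1)
        y_next = y_bar + (y_new - y_bar) / (n + 1)
        c_xx = (x_new * x_new + n * c_xx + n * x_bar * x_bar
                - (n + 1) * x_next * x_next) / (n + 1)
        c_xy = (x_new * y_new + n * c_xy + n * x_bar * y_bar
                - (n + 1) * x_next * y_next) / (n + 1)
        x_bar, y_bar = x_next, y_next
        w1 = c_xy / c_xx
        history.append((y_bar - w1 * x_bar, w1))
    return history


x = [94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131]
y = [0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23]

# Shuffle the (x, y) pairs together so each x keeps its own y.
pairs = list(zip(x, y))
random.Random(0).shuffle(pairs)   # fixed seed, chosen arbitrarily
x_shuffled, y_shuffled = zip(*pairs)

ordered = online_fit(x, y)
shuffled = online_fit(x_shuffled, y_shuffled)
```

Comparing the intermediate entries of `ordered` and `shuffled` is one way to examine how quickly each ordering approaches the common final value.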