Exercise 7.7 - Sufficient statistics for online linear regression

Answers

For question (a), according to (7.99), $w_1$ can be estimated solely from $C_{xy}^{(n)}$ and $C_{xx}^{(n)}$.
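Restating the relation in the notation above (this is also exactly what the `w1.append(Cxy/Cxx)` line computes in the code for parts (e)-(f)):

$$w_1^{(n)} = \frac{C_{xy}^{(n)}}{C_{xx}^{(n)}}.$$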

For question (b), according to (7.100), $w_0$ can be estimated from $\bar{x}^{(n)}$, $\bar{y}^{(n)}$ and $w_1$. Hence $\bar{x}^{(n)}$, $\bar{y}^{(n)}$, $C_{xy}^{(n)}$ and $C_{xx}^{(n)}$ are necessary.
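Restating (7.100) in the same notation (matching the `w0.append(...)` line in the code below):

$$w_0^{(n)} = \bar{y}^{(n)} - w_1^{(n)}\,\bar{x}^{(n)}.$$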

For question (c), the solution is given by (7.103)-(7.104). For $\bar{y}$, the update is simply the one for $\bar{x}$ with $x$ renamed to $y$ (an α-renaming).
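For reference, the recursive mean updates (the same ones implemented in the code for parts (e)-(f)) are

$$\bar{x}^{(n+1)} = \bar{x}^{(n)} + \frac{x_{n+1} - \bar{x}^{(n)}}{n+1}, \qquad \bar{y}^{(n+1)} = \bar{y}^{(n)} + \frac{y_{n+1} - \bar{y}^{(n)}}{n+1}.$$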

For question (d), we need to prove:

$$(n+1)\,C_{xy}^{(n+1)} = n\,C_{xy}^{(n)} + x_{n+1}y_{n+1} + n\,\bar{x}^{(n)}\bar{y}^{(n)} - (n+1)\,\bar{x}^{(n+1)}\bar{y}^{(n+1)}.$$

Expanding $C_{xy}$ on both sides, the l.h.s. becomes:

$$\sum_{i=1}^{n+1}\left(x_i - \bar{x}^{(n+1)}\right)\left(y_i - \bar{y}^{(n+1)}\right),$$

the r.h.s. becomes:

$$\sum_{i=1}^{n}\left(x_i - \bar{x}^{(n)}\right)\left(y_i - \bar{y}^{(n)}\right) + x_{n+1}y_{n+1} + n\,\bar{x}^{(n)}\bar{y}^{(n)} - (n+1)\,\bar{x}^{(n+1)}\bar{y}^{(n+1)}.$$

Finally, substituting the means at the $(n+1)$-th step on both sides using (7.103)-(7.104) and simplifying shows that the two sides are equal, which is enough to arrive at (7.105).
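Alternatively, the equality can be checked directly with the identity (valid for any $m$)

$$\sum_{i=1}^{m}\left(x_i - \bar{x}^{(m)}\right)\left(y_i - \bar{y}^{(m)}\right) = \sum_{i=1}^{m} x_i y_i - m\,\bar{x}^{(m)}\bar{y}^{(m)},$$

under which both sides above reduce to $\sum_{i=1}^{n+1} x_i y_i - (n+1)\,\bar{x}^{(n+1)}\bar{y}^{(n+1)}$.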

For questions (e) and (f):

import matplotlib.pyplot as plt

# Data borrowed from Exercise 7.8 (see the note below the figures).
x = [94, 96, 94, 95, 104, 106, 108, 113, 115, 121, 131]
y = [0.47, 0.75, 0.83, 0.98, 1.18, 1.29, 1.40, 1.60, 1.75, 1.90, 2.23]

# Initialise the running statistics from the first two data points (n = 2).
bx = (x[0] + x[1]) / 2          # running mean of x
by = (y[0] + y[1]) / 2          # running mean of y
Cxx = ((x[0] - bx)**2 + (x[1] - bx)**2) / 2
Cxy = ((y[0] - by) * (x[0] - bx) + (y[1] - by) * (x[1] - bx)) / 2

w1 = []
w0 = []
for n in range(2, len(x)):
    # Recursive mean updates, (7.103)-(7.104).
    bx_ = bx + (x[n] - bx) / (n + 1)
    by_ = by + (y[n] - by) / (n + 1)
    # Recursive updates of Cxx and Cxy, as in (7.105).
    Cxx = (x[n] * x[n] + n * Cxx + n * bx * bx - (n + 1) * bx_ * bx_) / (n + 1)
    Cxy = (x[n] * y[n] + n * Cxy + n * bx * by - (n + 1) * bx_ * by_) / (n + 1)
    bx = bx_
    by = by_
    # Current weight estimates, w1 = Cxy/Cxx and w0 = by - w1*bx.
    w1.append(Cxy / Cxx)
    w0.append(by - w1[-1] * bx)
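As a quick sanity check (a minimal sketch continuing from the block above, not part of the original solution; numpy is assumed to be available, and since the exact contents of the two figures below are not recoverable from the page, the plot here only traces the weight estimates over the updates):

import numpy as np

# Compare the final online estimates with the batch least-squares fit.
w1_batch, w0_batch = np.polyfit(x, y, deg=1)
print("online:", w1[-1], w0[-1])
print("batch :", w1_batch, w0_batch)

# Trace how the weight estimates evolve as more points are processed.
plt.plot(range(3, len(x) + 1), w1, label="w1")
plt.plot(range(3, len(x) + 1), w0, label="w0")
plt.xlabel("number of data points seen")
plt.legend()
plt.show()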

The resulting plots:

Figure 1: Exercise 7.7. P1

Figure 2: Exercise 7.7. P2

Here we borrow the data from Exercise 7.8 for illustration. The values in x and y are listed in the same sequential order as in Figure 7.14; that is to say, if we shuffled x and y accordingly (keeping each pair together), the weight estimates would be expected to converge faster.
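A minimal sketch of such a consistent shuffle (the fixed seed and the names x_shuf, y_shuf are arbitrary choices for illustration):

import random

# Shuffle the (x, y) pairs together so each x stays matched with its y.
pairs = list(zip(x, y))
random.seed(0)          # arbitrary seed, only for reproducibility
random.shuffle(pairs)
x_shuf, y_shuf = [list(t) for t in zip(*pairs)]
# Re-running the online updates above on x_shuf and y_shuf is then
# expected to let the weight estimates settle earlier than with the
# original ordering.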
