Exercise 3.4

Answers

(a)
We have $y = Xw + \epsilon$. Substituting this into the expression for the in-sample estimate $\hat{y} = Hy$, where $H = X(X^TX)^{-1}X^T$, gives

$$
\hat{y} = Hy = H(Xw + \epsilon) = HXw + H\epsilon = X(X^TX)^{-1}X^TXw + H\epsilon = Xw + H\epsilon
$$
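The identity $\hat{y} = Xw + H\epsilon$ can be sanity-checked numerically. The sketch below is only an illustration: the sizes `N` and `d`, the seed, the Gaussian noise, and the helper name `hat_matrix` are assumptions chosen for the example and are not part of the exercise.

```python
import numpy as np

def hat_matrix(X):
    """Return H = X (X^T X)^{-1} X^T, assuming X has full column rank."""
    return X @ np.linalg.solve(X.T @ X, X.T)

rng = np.random.default_rng(0)
N, d, sigma = 20, 3, 0.5  # illustrative values, not from the exercise

# Design matrix with a leading column of ones (the bias coordinate).
X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])
w = rng.normal(size=d + 1)        # hypothetical target weights
eps = sigma * rng.normal(size=N)  # noise, assumed Gaussian here
y = X @ w + eps

H = hat_matrix(X)
y_hat = H @ y

# H projects onto the column space of X, so HX = X.
hx_equals_x = np.allclose(H @ X, X)
# Part (a): y_hat = Xw + H eps.
part_a_holds = np.allclose(y_hat, X @ w + H @ eps)
print(hx_equals_x, part_a_holds)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical-stability choice; it computes the same matrix.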
(b)
The in-sample error vector is:

$$
\hat{y} - y = Xw + H\epsilon - (Xw + \epsilon) = (H - I)\epsilon
$$
(c)

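$$
\begin{aligned}
E_{\text{in}}(w_{\text{lin}}) &= \frac{1}{N}\lVert Xw_{\text{lin}} - y\rVert^2 \\
&= \frac{1}{N}\lVert y - \hat{y}\rVert^2 \\
&= \frac{1}{N}\lVert (I - H)\epsilon\rVert^2 \\
&= \frac{1}{N}\epsilon^T(I - H)^T(I - H)\epsilon \\
&= \frac{1}{N}\epsilon^T(I - H^T)(I - H)\epsilon \\
&= \frac{1}{N}\epsilon^T(I - H)(I - H)\epsilon \\
&= \frac{1}{N}\epsilon^T(I - H)\epsilon
\end{aligned}
$$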
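The last two steps of this chain rely on $H$ being symmetric ($H^T = H$) and idempotent ($H^2 = H$), so that $(I - H)(I - H) = I - H$. A minimal numerical sketch of both properties and of the resulting identity for $E_{\text{in}}$ follows; the problem sizes, the seed, and the Gaussian noise are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d, sigma = 30, 4, 1.0  # illustrative values, not from the exercise

X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])
w = rng.normal(size=d + 1)        # hypothetical target weights
eps = sigma * rng.normal(size=N)  # noise, assumed Gaussian here
y = X @ w + eps

H = X @ np.linalg.solve(X.T @ X, X.T)
I = np.eye(N)

# Least-squares solution w_lin = (X^T X)^{-1} X^T y.
w_lin = np.linalg.solve(X.T @ X, X.T @ y)

# E_in computed directly from the residuals ...
e_in_direct = np.mean((X @ w_lin - y) ** 2)
# ... and from the quadratic form (1/N) eps^T (I - H) eps.
e_in_quadratic = eps @ (I - H) @ eps / N

h_symmetric = np.allclose(H, H.T)
h_idempotent = np.allclose(H @ H, H)
print(h_symmetric, h_idempotent, np.isclose(e_in_direct, e_in_quadratic))
```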
(d)

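$$
\begin{aligned}
\mathbb{E}_{\mathcal{D}}\left[E_{\text{in}}(w_{\text{lin}})\right]
&= \mathbb{E}_{\mathcal{D}}\left[\frac{1}{N}\epsilon^T(I - H)\epsilon\right] \\
&= \frac{1}{N}\left(\mathbb{E}_{\mathcal{D}}\left[\epsilon^T\epsilon\right] - \mathbb{E}_{\mathcal{D}}\left[\epsilon^TH\epsilon\right]\right) \\
&= \frac{1}{N}\left(\mathbb{E}_{\mathcal{D}}\left[\sum_{k=1}^{N}\epsilon_k^2\right] - \mathbb{E}_{\mathcal{D}}\left[\sum_{i=1}^{N}\sum_{j=1}^{N}\epsilon_i h_{ij}\epsilon_j\right]\right) \\
&= \frac{1}{N}\left(\sum_{k=1}^{N}\mathbb{E}_{\mathcal{D}}\left[\epsilon_k^2\right] - \sum_{i=1}^{N}\sum_{j=1}^{N}\mathbb{E}_{\mathcal{D}}\left[\epsilon_i h_{ij}\epsilon_j\right]\right) \\
&= \frac{1}{N}\left(N\sigma^2 - \sum_{i=1}^{N}\mathbb{E}_{\mathcal{D}}\left[\epsilon_i^2 h_{ii}\right]\right) \\
&= \frac{1}{N}\left(N\sigma^2 - \sum_{i=1}^{N}h_{ii}\,\mathbb{E}_{\mathcal{D}}\left[\epsilon_i^2\right]\right) \\
&= \frac{1}{N}\left(N\sigma^2 - \sigma^2\operatorname{trace}(H)\right) \\
&= \sigma^2\left(1 - \frac{\operatorname{trace}(H)}{N}\right) \\
&= \sigma^2\left(1 - \frac{d + 1}{N}\right)
\end{aligned}
$$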

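Here we used the fact that the $\epsilon_i$ are independent with zero mean and variance $\sigma^2$, so $\mathbb{E}_{\mathcal{D}}[\epsilon_i\epsilon_j] = 0$ for $i \neq j$ and only the diagonal terms of the double sum survive. We also treat $H$ as fixed rather than random: the expectation is taken over the noise only, not over $x$, which is why $h_{ii}$ can be pulled out of $\mathbb{E}_{\mathcal{D}}[\epsilon_i^2 h_{ii}]$. The last step uses $\operatorname{trace}(H) = \operatorname{trace}\left((X^TX)^{-1}X^TX\right) = d + 1$.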
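The result $\sigma^2\left(1 - \frac{d+1}{N}\right)$ can be checked with a small Monte Carlo simulation that holds $X$ fixed and redraws only the noise. This is a sketch under stated assumptions: Gaussian noise, an arbitrary seed, and illustrative values for `N`, `d`, `sigma` and `trials`. The exercise itself only requires independent zero-mean noise with variance $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, d, sigma, trials = 20, 3, 1.0, 20000  # illustrative values

# X is drawn once and then held fixed; only the noise is resampled.
X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])
H = X @ np.linalg.solve(X.T @ X, X.T)
M = np.eye(N) - H  # symmetric, so each row eps^T M equals ((I - H) eps)^T

# One noise vector per row.
eps = sigma * rng.normal(size=(trials, N))

# E_in for each trial: (1/N) * ||(I - H) eps||^2.
e_in = np.mean((eps @ M) ** 2, axis=1)

trace_h = np.trace(H)
empirical = e_in.mean()
predicted = sigma**2 * (1 - (d + 1) / N)
print(trace_h, empirical, predicted)
```

The empirical average only approximates the prediction, with a gap that shrinks as `trials` grows; the trace identity holds up to floating-point error.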

(e)
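Since $X$ does not change and only the noise changes from $\epsilon$ to $\epsilon'$, the test targets are $y' = Xw + \epsilon'$, and we have

$$
\hat{y} - y' = Xw + H\epsilon - (Xw + \epsilon') = H\epsilon - \epsilon'
$$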

Following the procedure in problems (c) and (d), we have

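$$
\begin{aligned}
\mathbb{E}_{\mathcal{D},\epsilon'}\left[E_{\text{test}}(w_{\text{lin}})\right]
&= \mathbb{E}_{\mathcal{D},\epsilon'}\left[\frac{1}{N}\lVert y' - \hat{y}\rVert^2\right] \\
&= \mathbb{E}_{\mathcal{D},\epsilon'}\left[\frac{1}{N}\lVert \epsilon' - H\epsilon\rVert^2\right] \\
&= \frac{1}{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[(\epsilon' - H\epsilon)^T(\epsilon' - H\epsilon)\right] \\
&= \frac{1}{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[(\epsilon'^T - \epsilon^TH^T)(\epsilon' - H\epsilon)\right] \\
&= \frac{1}{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[(\epsilon'^T - \epsilon^TH)(\epsilon' - H\epsilon)\right] \\
&= \frac{1}{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[\epsilon'^T\epsilon' - \epsilon'^TH\epsilon - \epsilon^TH\epsilon' + \epsilon^THH\epsilon\right] \\
&= \frac{1}{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[\epsilon'^T\epsilon' - \epsilon'^TH\epsilon - \epsilon^TH\epsilon' + \epsilon^TH\epsilon\right] \\
&= \frac{1}{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[\epsilon'^T\epsilon' + \epsilon^TH\epsilon\right] \\
&= \frac{1}{N}\left(\sum_{k=1}^{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[\epsilon_k'^2\right] + \sum_{i=1}^{N}\sum_{j=1}^{N}\mathbb{E}_{\mathcal{D},\epsilon'}\left[\epsilon_i h_{ij}\epsilon_j\right]\right) \\
&= \sigma^2\left(1 + \frac{d + 1}{N}\right)
\end{aligned}
$$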

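Here we have used the fact that $\epsilon$ and $\epsilon'$ are zero-mean and independent of each other, so the cross terms $\epsilon'^TH\epsilon$ and $\epsilon^TH\epsilon'$ vanish in expectation, and that the components $\epsilon_k$, and likewise the components $\epsilon_k'$, are independent among themselves.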
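The test-error result $\sigma^2\left(1 + \frac{d+1}{N}\right)$ can be checked by simulation in the same spirit: keep $X$ fixed, draw a training noise vector and an independent test noise vector, and average $\frac{1}{N}\lVert \epsilon' - H\epsilon\rVert^2$. As before, this is only a sketch; the Gaussian noise, the seed, and the values of `N`, `d`, `sigma` and `trials` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, d, sigma, trials = 20, 3, 1.0, 20000  # illustrative values

# Same inputs X for training and test; only the noise differs.
X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])
H = X @ np.linalg.solve(X.T @ X, X.T)

eps_train = sigma * rng.normal(size=(trials, N))  # eps, one vector per row
eps_test = sigma * rng.normal(size=(trials, N))   # eps', drawn independently

# y' - y_hat = eps' - H eps; H is symmetric, so row-wise eps^T H = (H eps)^T.
test_residuals = eps_test - eps_train @ H
e_test = np.mean(test_residuals ** 2, axis=1)

empirical = e_test.mean()
predicted = sigma**2 * (1 + (d + 1) / N)
print(empirical, predicted)
```

Comparing this with the in-sample result, the same quantity $\sigma^2\frac{d+1}{N}$ is subtracted from $\sigma^2$ for $E_{\text{in}}$ and added to it for $E_{\text{test}}$.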

2021-12-07 22:13