Exercise 13.1 - Partial derivative of the RSS

Answers

For question (a), define:

$$\mathrm{RSS}(\mathbf{w}) = \sum_{n=1}^{N} \left( y_n - \mathbf{w}^T \mathbf{x}_n \right)^2 .$$

Then we have straightforwardly:

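$$\begin{aligned}
\frac{\partial}{\partial w_j} \mathrm{RSS}(\mathbf{w})
&= \sum_{n=1}^{N} 2 \left( y_n - \mathbf{w}^T \mathbf{x}_n \right) \left( -x_{nj} \right) \\
&= -\sum_{n=1}^{N} 2 \left( x_{nj} y_n - x_{nj} \sum_{i=1}^{D} w_i x_{ni} \right) \\
&= -\sum_{n=1}^{N} 2 \left( x_{nj} y_n - x_{nj} \sum_{i \neq j}^{D} w_i x_{ni} - x_{nj}^2 w_j \right) .
\end{aligned}$$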

From this we observe that the coefficient of $w_j$ is:

$$a_j = 2 \sum_{n=1}^{N} x_{nj}^2 ,$$

while the remaining terms, which do not involve $w_j$, can be absorbed into:

$$c_j = 2 \sum_{n=1}^{N} x_{nj} \left( y_n - \mathbf{w}_{-j}^T \mathbf{x}_{n,-j} \right) .$$

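The derivative therefore takes the form $\frac{\partial}{\partial w_j} \mathrm{RSS}(\mathbf{w}) = a_j w_j - c_j$. Setting it to zero gives the optimal value for $w_j$: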

$$\hat{w}_j = \frac{c_j}{a_j} .$$
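As a sanity check on the derivation above, here is a minimal sketch that compares $a_j w_j - c_j$ against a central finite difference of the RSS and confirms that $w_j = c_j / a_j$ zeroes the partial derivative. The function names `rss` and `a_c`, the random data, and the chosen sizes are illustrative assumptions, not part of the exercise.

```python
import random

def rss(w, X, y):
    """Residual sum of squares: sum_n (y_n - w^T x_n)^2."""
    return sum(
        (yn - sum(wi * xi for wi, xi in zip(w, xn))) ** 2
        for xn, yn in zip(X, y)
    )

def a_c(w, X, y, j):
    """Return (a_j, c_j); c_j uses every weight except w_j."""
    a = 2 * sum(xn[j] ** 2 for xn in X)
    c = 2 * sum(
        xn[j] * (yn - sum(w[i] * xn[i] for i in range(len(w)) if i != j))
        for xn, yn in zip(X, y)
    )
    return a, c

# Hypothetical random problem: N samples, D features, coordinate j.
random.seed(0)
N, D, j = 20, 4, 2
X = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
y = [random.gauss(0, 1) for _ in range(N)]
w = [random.gauss(0, 1) for _ in range(D)]

# Analytic partial derivative from the derivation: a_j * w_j - c_j.
a, c = a_c(w, X, y, j)
analytic = a * w[j] - c

# Central finite difference of RSS along coordinate j.
eps = 1e-6
w_plus = w[:]
w_plus[j] += eps
w_minus = w[:]
w_minus[j] -= eps
numeric = (rss(w_plus, X, y) - rss(w_minus, X, y)) / (2 * eps)

# Replace w_j by c_j / a_j and re-evaluate the partial derivative there.
w_opt = w[:]
w_opt[j] = c / a
a_opt, c_opt = a_c(w_opt, X, y, j)
grad_at_opt = a_opt * w_opt[j] - c_opt

print(analytic, numeric, grad_at_opt)
```

Because neither $a_j$ nor $c_j$ depends on $w_j$, the update is exact in that coordinate: the two derivative estimates should agree up to rounding, and the derivative at the updated point should be numerically zero.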

For question (b), (13.184) follows directly by substituting the definition of $\mathbf{r}_k$ into (13.182)-(13.183) and the expression for $\hat{w}_j$.
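The substitution can be checked numerically. The sketch below assumes that $\mathbf{r}_j$ denotes the residual computed without feature $j$, i.e. $\mathbf{r}_j = \mathbf{y} - \mathbf{X}_{:,-j} \mathbf{w}_{-j}$, so that $c_j = 2\, \mathbf{x}_{:,j}^T \mathbf{r}_j$ and $a_j = 2 \lVert \mathbf{x}_{:,j} \rVert^2$. This reading of the textbook's definition is an assumption here, and the helper names and toy data are hypothetical.

```python
def w_hat_from_sums(w, X, y, j):
    """c_j / a_j computed from the element-wise sums over samples."""
    a = 2 * sum(xn[j] ** 2 for xn in X)
    c = 2 * sum(
        xn[j] * (yn - sum(w[i] * xn[i] for i in range(len(w)) if i != j))
        for xn, yn in zip(X, y)
    )
    return c / a

def w_hat_from_residual(w, X, y, j):
    """x_{:,j}^T r_j / ||x_{:,j}||^2, with r_j assumed to be y - X_{:,-j} w_{-j}."""
    # Residual vector that leaves out the contribution of feature j.
    r = [
        yn - sum(w[i] * xn[i] for i in range(len(w)) if i != j)
        for xn, yn in zip(X, y)
    ]
    col = [xn[j] for xn in X]  # j-th column of the design matrix
    return sum(x * rn for x, rn in zip(col, r)) / sum(x * x for x in col)

# Toy data, chosen only for illustration.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 2.0, 3.0]
w = [0.5, -1.0]

via_sums = w_hat_from_sums(w, X, y, 0)
via_residual = w_hat_from_residual(w, X, y, 0)
print(via_sums, via_residual)
```

The two routes differ only in a common factor of 2 in the numerator and denominator, which is why they coincide.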

2021-03-24 13:42