Exercise 7.3 - Centering and ridge regression

Answers

The loss is given by:

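$$L(\mathbf{w}, w_0) = (\mathbf{y} - \mathbf{X}^T\mathbf{w} - w_0\mathbf{1})^T(\mathbf{y} - \mathbf{X}^T\mathbf{w} - w_0\mathbf{1}) + \lambda\,\mathbf{w}^T\mathbf{w}.$$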

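Taking the partial derivative w.r.t. $w_0$ yields:

$$\frac{\partial L}{\partial w_0} = -2\,\mathbf{1}^T(\mathbf{y} - \mathbf{X}^T\mathbf{w} - w_0\mathbf{1}) = -2\left[\mathbf{1}^T(\mathbf{y} - \mathbf{X}^T\mathbf{w}) - N w_0\right].$$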

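Setting it to zero gives $w_0 = \frac{1}{N}\mathbf{1}^T(\mathbf{y} - \mathbf{X}^T\mathbf{w}) = \bar{y} - \bar{\mathbf{x}}^T\mathbf{w} = \bar{y}$, which is (7.94); the last step uses $\bar{\mathbf{x}} = \mathbf{0}$.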

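Taking the partial gradient w.r.t. $\mathbf{w}$ and setting it to zero yields the ridge estimate:

$$\hat{\mathbf{w}} = (\mathbf{X}\mathbf{X}^T + \lambda\mathbf{I})^{-1}\mathbf{X}(\mathbf{y} - w_0\mathbf{1}).$$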
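
For reference, the intermediate step can be written out as follows. This is a sketch that assumes the loss carries the ridge penalty $\lambda\,\mathbf{w}^T\mathbf{w}$ (which the $\lambda\mathbf{I}$ term in the solution implies) and that $\mathbf{X}$ is $D \times N$ with one observation per column:

$$\nabla_{\mathbf{w}} L = -2\,\mathbf{X}(\mathbf{y} - \mathbf{X}^T\mathbf{w} - w_0\mathbf{1}) + 2\lambda\,\mathbf{w} = \mathbf{0} \;\Longrightarrow\; (\mathbf{X}\mathbf{X}^T + \lambda\mathbf{I})\,\mathbf{w} = \mathbf{X}(\mathbf{y} - w_0\mathbf{1}).$$

For $\lambda > 0$ the matrix $\mathbf{X}\mathbf{X}^T + \lambda\mathbf{I}$ is positive definite, so it can be inverted.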

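Because the inputs are centered, $\mathbf{X}\mathbf{1} = N\bar{\mathbf{x}} = \mathbf{0}$, so the $w_0$ term vanishes and $\mathbf{X}(\mathbf{y} - w_0\mathbf{1}) = \mathbf{X}\mathbf{y}$. Equation (7.95) therefore holds only when the observed inputs have been centered.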
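
The result can be checked numerically. The sketch below is illustrative only: the function names `ridge_with_offset` and `ridge_centered`, the random data, and the value of `lam` are all assumptions, not part of the exercise. It minimises the penalised loss jointly over $(\mathbf{w}, w_0)$ by appending a row of ones to $\mathbf{X}$ (leaving the offset unpenalised), then compares the result with the closed form for centered inputs.

```python
import numpy as np

def ridge_with_offset(X, y, lam):
    """Jointly minimise ||y - X^T w - w0 1||^2 + lam * ||w||^2 over (w, w0).

    X has shape (D, N): one column per observation, as in the text.
    """
    D, N = X.shape
    # Append a row of ones so that w0 becomes one more coefficient.
    A = np.vstack([X, np.ones((1, N))])
    # Penalise w only; the offset w0 is left unregularised.
    P = lam * np.eye(D + 1)
    P[-1, -1] = 0.0
    theta = np.linalg.solve(A @ A.T + P, A @ y)
    return theta[:-1], theta[-1]

def ridge_centered(X, y, lam):
    """Closed form for centered inputs: w0 = mean(y), w = (X X^T + lam I)^{-1} X y."""
    D = X.shape[0]
    w = np.linalg.solve(X @ X.T + lam * np.eye(D), X @ y)
    return w, y.mean()

# Hypothetical data, used only to exercise the two functions.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
X -= X.mean(axis=1, keepdims=True)  # center each feature so that x_bar = 0
y = rng.normal(size=50) + 4.0

w_joint, w0_joint = ridge_with_offset(X, y, lam=0.5)
w_closed, w0_closed = ridge_centered(X, y, lam=0.5)
print(np.allclose(w_joint, w_closed), np.isclose(w0_joint, w0_closed))
```

If the centering line is removed, the joint solution instead satisfies $w_0 = \bar{y} - \bar{\mathbf{x}}^T\mathbf{w}$, and the closed form above no longer matches it in general.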

2021-03-24 13:42