Exercise 7.11

Answers

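Taking the derivative of the second term of $E_{\text{aug}}(w,\lambda)$ with respect to $w_{ij}^{(l)}$, only the summand that contains $w_{ij}^{(l)}$ contributes, and the quotient rule gives

$$\frac{\partial}{\partial w_{ij}^{(l)}}\left[\frac{\lambda}{N}\,\frac{\left(w_{ij}^{(l)}\right)^2}{1+\left(w_{ij}^{(l)}\right)^2}\right] = \frac{2\lambda}{N}\,\frac{w_{ij}^{(l)}}{\left(1+\left(w_{ij}^{(l)}\right)^2\right)^2}.$$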

This proves the equation.
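
As a quick sanity check, the closed-form derivative of the penalty can be compared against a finite-difference estimate. This is only a sketch for a single scalar weight: the function names, the step size `h`, and the default values `lam=1.0` and `n=1` are illustrative assumptions, not part of the exercise.

```python
def penalty(w, lam=1.0, n=1):
    """Penalty for one weight: (lam / n) * w^2 / (1 + w^2). lam and n are illustrative."""
    return (lam / n) * w**2 / (1.0 + w**2)


def penalty_grad(w, lam=1.0, n=1):
    """Closed-form derivative: (2 * lam / n) * w / (1 + w^2)^2."""
    return (2.0 * lam / n) * w / (1.0 + w**2) ** 2


def central_difference(f, w, h=1e-6):
    """Symmetric finite-difference estimate of f'(w); h is an assumed step size."""
    return (f(w + h) - f(w - h)) / (2.0 * h)


if __name__ == "__main__":
    # The two columns should agree to several decimal places at every test point.
    for w in (-3.0, -0.5, 0.1, 1.0, 4.0):
        numeric = central_difference(penalty, w)
        analytic = penalty_grad(w)
        print(f"w={w:+.1f}  analytic={analytic:+.6f}  numeric={numeric:+.6f}")
```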

To compare rates of decay, we look at the ratio of the penalty gradient to the weight itself.

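From the derivative, the ratio of the second term to the weight $w_{ij}^{(l)}$ is

$$\frac{2\lambda}{N}\cdot\frac{1}{\left(1+\left(w_{ij}^{(l)}\right)^2\right)^2}.$$

The factor $1/\left(1+\left(w_{ij}^{(l)}\right)^2\right)^2$ attains its maximum value of 1 as $w_{ij}^{(l)} \to 0$, so the ratio peaks at $2\lambda/N$, and it falls off like $1/\left(w_{ij}^{(l)}\right)^4$ for large weights. So the smaller the weight, the larger the decay relative to the weight itself.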

This indicates that small weights decay much faster than large ones.
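
The effect can be illustrated numerically with a single gradient-descent step on the penalty term alone. This is a minimal sketch under stated assumptions: the contribution of $E_{\text{in}}$ is ignored, and the learning rate `eta`, the values `lam=1.0` and `n=1`, and the function names are all hypothetical choices made for illustration.

```python
def decay_ratio(w, lam=1.0, n=1):
    """Ratio of the penalty gradient to the weight: (2 * lam / n) / (1 + w^2)^2."""
    return (2.0 * lam / n) / (1.0 + w**2) ** 2


def shrink_step(w, eta=0.1, lam=1.0, n=1):
    """One descent step on the penalty only: w <- w * (1 - eta * decay_ratio(w))."""
    return w * (1.0 - eta * decay_ratio(w, lam, n))


if __name__ == "__main__":
    # Print the fraction of each weight removed by one step.
    for w in (0.01, 0.1, 1.0, 10.0):
        w_new = shrink_step(w)
        removed = (w - w_new) / w
        print(f"w={w:>5}  after one step={w_new:.6f}  fraction removed={removed:.6f}")
```

With these assumed settings, the fraction removed in one step is `eta * decay_ratio(w)`, which is largest for weights near zero and becomes negligible for large weights.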

2021-12-08 09:55