Exercise 7.14
Answers
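Taking the error function to be $E(w) = (w - w^*)^T Q (w - w^*)$ with $Q$ symmetric positive definite, and differentiating $E(w)$ w.r.t. $w$, we have $\nabla E(w) = 2Q(w - w^*)$, which equals $-2Qw^*$ when $w = 0$.
The weights that minimize $E(w)$ are $w = w^*$, where $E(w^*) = 0$. From the calculated $\nabla E(0)$, gradient descent at $w = 0$ moves in the same direction as $-\nabla E(0)$, i.e. along $Qw^*$. This is parallel to $w^*$ only when $w^*$ is an eigenvector of $Q$.
The step direction should be the same as $-\nabla E(w)$: we need to normalize $\nabla E(w)$ to get a unit direction, and the size of the step is then controlled by the step size $\eta$. Otherwise we may overshoot, even if we are moving in the right direction.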
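The effect of the step size can be checked numerically. The following is a minimal sketch, not part of the original answer. It assumes the error function is $E(w) = (w - w^*)^T Q (w - w^*)$ with a symmetric positive definite $Q$. The particular `Q`, `w_star`, and the step sizes 0.1 and 5.0 are hypothetical values chosen only for illustration.

```python
# Assumed setup (illustrative values, not from the exercise):
# a small symmetric positive definite Q and a target w_star.
Q = [[3.0, 1.0],
     [1.0, 2.0]]
w_star = [1.0, -1.0]


def matvec(A, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def error(w):
    """E(w) = (w - w*)^T Q (w - w*)."""
    d = [wi - si for wi, si in zip(w, w_star)]
    return sum(di * qi for di, qi in zip(d, matvec(Q, d)))


def gradient(w):
    """Gradient of E for symmetric Q: 2 Q (w - w*)."""
    d = [wi - si for wi, si in zip(w, w_star)]
    return [2.0 * g for g in matvec(Q, d)]


def step(w, eta):
    """Move a distance eta along the unit vector -grad / ||grad||."""
    g = gradient(w)
    norm = sum(gi * gi for gi in g) ** 0.5
    return [wi - eta * gi / norm for wi, gi in zip(w, g)]


w0 = [0.0, 0.0]
g0 = gradient(w0)  # equals -2 Q w*, here [-4.0, 2.0]

# A nonzero 2-D cross product means -grad is not parallel to w*.
cross = g0[0] * w_star[1] - g0[1] * w_star[0]

small = error(step(w0, 0.1))  # short step along -grad
large = error(step(w0, 5.0))  # long step along the same direction

print("gradient at 0:", g0)
print("parallel to w*:", cross == 0.0)
print("E(0), E(small step), E(large step):", error(w0), small, large)
```

With these assumed values the short step lowers the error below $E(0)$, while the long step along the same direction raises it above $E(0)$. The negative gradient is therefore a descent direction here only for a sufficiently small step.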