Learning from Data (Yaser Abu-Mostafa), Exercise 3.10
Answers
(a) If $e_n(\mathbf{w}) = \max(0,\, -y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n)$, then the SGD algorithm updates the weights by $\mathbf{w} \leftarrow \mathbf{w} - \eta\,\nabla e_n(\mathbf{w})$.
When $\eta = 1$, the gradient of $e_n(\mathbf{w})$ is $\mathbf{0}$ when $y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n > 0$ (the sample is correctly classified), and the gradient is $-y_n \mathbf{x}_n$ when $y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n < 0$ (the sample is misclassified).
Substituting these gradients into the SGD update gives $\mathbf{w} \leftarrow \mathbf{w} + y_n \mathbf{x}_n$ on misclassified samples and leaves $\mathbf{w}$ unchanged otherwise, which is exactly PLA.
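As a quick check, here is a minimal NumPy sketch of this single SGD step (the function name and toy numbers are mine, not from the text):

```python
import numpy as np

def sgd_perceptron_step(w, x, y, eta=1.0):
    """One SGD step on e_n(w) = max(0, -y * w.x)."""
    if y * np.dot(w, x) < 0:       # misclassified: gradient of e_n is -y*x
        w = w + eta * y * x        # w <- w - eta * (-y*x), the PLA update
    return w                       # correctly classified: gradient is 0

# a misclassified point: y * w.x = 1 * (0.5 - 2.0) < 0
w = sgd_perceptron_step(np.array([0.5, -1.0]), np.array([1.0, 2.0]), y=1)
print(w)  # [0.5, -1.0] + [1.0, 2.0] = [1.5, 1.0], the PLA update
```

With $\eta = 1$ the misclassified branch is literally the PLA update rule.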
(b) For logistic regression, we have $e_n(\mathbf{w}) = \ln\!\left(1 + e^{-y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n}\right)$, so $\nabla e_n(\mathbf{w}) = \dfrac{-y_n \mathbf{x}_n}{1 + e^{\,y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n}}$. If $\|\mathbf{w}\|$ is very large:

* When $y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n > 0$, $\nabla e_n(\mathbf{w}) \approx \mathbf{0}$.
* When $y_n \mathbf{w}^{\mathsf T}\mathbf{x}_n < 0$, $\nabla e_n(\mathbf{w}) \approx -y_n \mathbf{x}_n$.
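This limiting behavior can be verified numerically; the sketch below (variable names are mine) evaluates the logistic-loss gradient at a weight vector with large norm:

```python
import numpy as np

def logistic_grad(w, x, y):
    # gradient of e_n(w) = ln(1 + exp(-y * w.x)) with respect to w
    return -y * x / (1.0 + np.exp(y * np.dot(w, x)))

x = np.array([1.0, 2.0])
w = 100.0 * np.array([1.0, 1.0])   # very large ||w||

g_correct = logistic_grad(w, x, y=1)    # y * w.x = 300 >> 0
g_wrong = logistic_grad(w, x, y=-1)     # y * w.x = -300 << 0

print(g_correct)  # ~ [0, 0]: essentially no update when correctly classified
print(g_wrong)    # ~ -y*x = [1, 2]: the PLA direction when misclassified
```

The large exponent drives the denominator to either $\approx \infty$ or $\approx 1$, which is exactly the two cases listed above.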
These gradients coincide with the update directions used in PLA: no change for correctly classified samples, and a step of $y_n \mathbf{x}_n$ for misclassified ones.
This is another indication that the logistic regression weights can be used as a good approximation for classification.
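To illustrate that last point, here is a hedged sketch (the toy data, learning rate, and epoch count are my own choices) that trains logistic regression with SGD on separable data and then classifies with the sign of $\mathbf{w}^{\mathsf T}\mathbf{x}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data, separable by the line x1 + x2 = 0 with a small margin
X = rng.normal(size=(200, 2))
X = X[np.abs(X[:, 0] + X[:, 1]) > 0.3]
y = np.sign(X[:, 0] + X[:, 1])

# SGD on the logistic loss e_n(w) = ln(1 + exp(-y_n w.x_n))
w = np.zeros(2)
eta = 0.5
for _ in range(200):
    for i in rng.permutation(len(X)):
        s = min(y[i] * np.dot(w, X[i]), 500.0)   # cap exponent for stability
        w -= eta * (-y[i] * X[i] / (1.0 + np.exp(s)))

# use the logistic regression weights directly as a linear classifier
accuracy = np.mean(np.sign(X @ w) == y)
print(accuracy)
```

On this separable toy set the learned weights separate the training points, supporting the claim that logistic regression weights serve as a good classification approximation.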