Exercise 13.11 - Projected gradient descent for l1 regularized least squares
Answers
Generally, we take the gradient of the objective $f(w)$ and descend along it. When a constraint on $w$ would be broken by a gradient descent step, the increment is moderated (projected) so that the constraint remains valid. For the following loss function:

$$f(w) = \mathrm{NLL}(w) + \lambda \|w\|_1$$

consider it under a linear regression context:

$$f(w) = \frac{1}{2}\|y - Xw\|_2^2 + \lambda \|w\|_1$$
Since $\|w\|_1$ is not differentiable at zero, it is suggested to split $w$ into its positive and negative parts:

$$w = u - v$$

where

$$u_i = (w_i)_+ = \max(0, w_i), \qquad v_i = (-w_i)_+ = \max(0, -w_i)$$

so that $u \geq 0$ and $v \geq 0$. With this splitting, we have:

$$\|w\|_1 = \mathbf{1}^T u + \mathbf{1}^T v$$

For example, $w = (3, -2)$ splits into $u = (3, 0)$ and $v = (0, 2)$, giving $\|w\|_1 = 5 = \mathbf{1}^T u + \mathbf{1}^T v$.
Hence the original problem is translated into:

$$\min_{u \geq 0,\; v \geq 0} \; \frac{1}{2}\|y - X(u - v)\|_2^2 + \lambda \mathbf{1}^T u + \lambda \mathbf{1}^T v$$
Denote:

$$z = \begin{bmatrix} u \\ v \end{bmatrix}$$

then we can rewrite the target to be optimized as a quadratic in $z$:

$$\min_{z \geq 0} \; F(z) = c^T z + \frac{1}{2} z^T B z$$

where:

$$c = \lambda \mathbf{1}_{2d} + \begin{bmatrix} -X^T y \\ X^T y \end{bmatrix}, \qquad B = \begin{bmatrix} X^T X & -X^T X \\ -X^T X & X^T X \end{bmatrix}$$

(the constant term $\frac{1}{2} y^T y$ has been dropped).
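To make the reformulation concrete, here is a minimal NumPy sketch that builds $B$ and $c$ and checks the quadratic form against the original lasso objective. The function name `build_qp` and the variable names `X`, `y`, `lam` are illustrative assumptions, not from the book or the paper:

```python
import numpy as np

def build_qp(X, y, lam):
    """Build B and c for min_{z >= 0} c^T z + 0.5 * z^T B z,
    where z = [u; v] and w = u - v."""
    d = X.shape[1]
    G = X.T @ X
    B = np.block([[G, -G], [-G, G]])
    c = lam * np.ones(2 * d) + np.concatenate([-X.T @ y, X.T @ y])
    return B, c

# Sanity check: F(z) equals the lasso objective up to the constant 0.5 * y^T y.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))
y = rng.standard_normal(20)
lam, w = 0.1, rng.standard_normal(5)
u, v = np.maximum(w, 0), np.maximum(-w, 0)  # positive / negative parts of w
z = np.concatenate([u, v])
B, c = build_qp(X, y, lam)
F = c @ z + 0.5 * z @ B @ z
lasso = 0.5 * np.sum((y - X @ w) ** 2) + lam * np.sum(np.abs(w))
assert np.isclose(F + 0.5 * y @ y, lasso)
```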
Now we have changed the problem to a quadratic problem with a simple bound constraint. The gradient is given by:

$$\nabla F(z) = c + Bz$$

An ordinary gradient descent step with learning rate $\eta$ is:

$$z_{k+1} = z_k - \eta \nabla F(z_k)$$
For the projected case, take the projection onto the feasible set $\{z : z \geq 0\}$:

$$z_{k+1} = \big[z_k - \eta \nabla F(z_k)\big]_+ = \max\big(0,\; z_k - \eta \nabla F(z_k)\big)$$

And:

$$w_{k+1} = u_{k+1} - v_{k+1}$$

hence $z_{k+1}$ always satisfies $z_{k+1} \geq 0$, and $w_{k+1}$ is a legal weight candidate.
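Putting the steps together, the following is a minimal sketch of the projected gradient iteration with a fixed step size. The fixed step $\eta = 1/\lambda_{\max}(B)$, the function name `lasso_pgd`, and the stopping parameters `max_iter` and `tol` are assumptions for illustration; the paper referenced below uses a more sophisticated step-size rule:

```python
import numpy as np

def lasso_pgd(X, y, lam, max_iter=5000, tol=1e-8):
    """Projected gradient descent for l1-regularized least squares
    via the nonnegative QP in z = [u; v], recovering w = u - v."""
    d = X.shape[1]
    G = X.T @ X
    B = np.block([[G, -G], [-G, G]])
    c = lam * np.ones(2 * d) + np.concatenate([-X.T @ y, X.T @ y])
    eta = 1.0 / np.linalg.eigvalsh(B)[-1]  # 1/L with L = lambda_max(B)
    z = np.zeros(2 * d)
    for _ in range(max_iter):
        grad = c + B @ z                          # gradient of F(z)
        z_next = np.maximum(0.0, z - eta * grad)  # project onto z >= 0
        if np.linalg.norm(z_next - z) < tol:
            z = z_next
            break
        z = z_next
    return z[:d] - z[d:]  # w = u - v

# Usage: with a sparse ground truth, the recovered w should be sparse too.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
w_true = np.zeros(10)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + 0.1 * rng.standard_normal(50)
w_hat = lasso_pgd(X, y, lam=1.0)
```

Note that the projection onto the nonnegative orthant is just a componentwise $\max(0, \cdot)$, which is what makes this reformulation attractive: each iteration costs little more than an ordinary gradient step.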
The original paper suggests a more delicate method to moderate the learning rate; refer to Gradient Projection for Sparse Reconstruction: Application to Compressed Sensing and Other Inverse Problems, Mário A. T. Figueiredo, Robert D. Nowak, and Stephen J. Wright.