
Exercise 8.3 - Gradient and Hessian of log-likelihood for logistic regression

Answers

For question (a),

$$\frac{d}{da}\sigma(a)=\frac{\exp(-a)}{\bigl(1+\exp(-a)\bigr)^{2}}=\frac{1}{1+e^{-a}}\cdot\frac{e^{-a}}{1+e^{-a}}=\sigma(a)\bigl(1-\sigma(a)\bigr).$$
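As a quick sanity check of the identity $\sigma'(a)=\sigma(a)(1-\sigma(a))$, one can compare it against a central finite difference. This is a minimal sketch using NumPy (not part of the original exercise):

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid sigma(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

# Central finite difference approximation of d/da sigma(a)
a = 0.7
eps = 1e-6
numeric = (sigmoid(a + eps) - sigmoid(a - eps)) / (2 * eps)

# Closed form from question (a): sigma(a) * (1 - sigma(a))
analytic = sigmoid(a) * (1 - sigmoid(a))

print(abs(numeric - analytic))  # difference should be negligibly small
```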

For question (b),

$$\begin{aligned}
\mathbf{g}(\mathbf{w}) &= \nabla_{\mathbf{w}}\,\mathrm{NLL}(\mathbf{w})
= -\sum_{i=1}^{N}\nabla_{\mathbf{w}}\bigl[y_i\log\mu_i+(1-y_i)\log(1-\mu_i)\bigr] \\
&= -\sum_{i=1}^{N}\Bigl[y_i\,\frac{1}{\sigma_i}\,\sigma_i(1-\sigma_i)\,\mathbf{x}_i-(1-y_i)\,\frac{1}{1-\sigma_i}\,\sigma_i(1-\sigma_i)\,\mathbf{x}_i\Bigr] \\
&= \sum_{i=1}^{N}\bigl(\sigma(\mathbf{w}^{T}\mathbf{x}_i)-y_i\bigr)\,\mathbf{x}_i,
\end{aligned}$$

where $\mu_i = \sigma_i = \sigma(\mathbf{w}^{T}\mathbf{x}_i)$.
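The closed-form gradient $\sum_i(\sigma(\mathbf{w}^T\mathbf{x}_i)-y_i)\mathbf{x}_i = \mathbf{X}^T(\boldsymbol{\mu}-\mathbf{y})$ can likewise be checked numerically against finite differences of the NLL. A minimal NumPy sketch with randomly generated data (the data and seed are illustrative, not from the book):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def nll(w, X, y):
    """Negative log-likelihood of logistic regression."""
    mu = sigmoid(X @ w)
    return -np.sum(y * np.log(mu) + (1 - y) * np.log(1 - mu))

def grad(w, X, y):
    """Closed-form gradient: X^T (mu - y)."""
    return X.T @ (sigmoid(X @ w) - y)

# Small random problem, for illustration only
rng = np.random.default_rng(0)
N, D = 20, 3
X = rng.normal(size=(N, D))
y = rng.integers(0, 2, size=N).astype(float)
w = rng.normal(size=D)

# Numerical gradient via central differences, one coordinate at a time
eps = 1e-6
num = np.array([(nll(w + eps * e, X, y) - nll(w - eps * e, X, y)) / (2 * eps)
                for e in np.eye(D)])

print(np.max(np.abs(num - grad(w, X, y))))  # discrepancy should be tiny
```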

For question (c), differentiating the gradient once more gives the Hessian $\mathbf{H}=\nabla_{\mathbf{w}}\,\mathbf{g}(\mathbf{w})^{T}=\sum_{i=1}^{N}\sigma_i(1-\sigma_i)\,\mathbf{x}_i\mathbf{x}_i^{T}=\mathbf{X}^{T}\mathbf{S}\mathbf{X}$, where $\mathbf{S}=\mathrm{diag}\bigl(\mu_1(1-\mu_1),\dots,\mu_N(1-\mu_N)\bigr)$. For an arbitrary vector $\mathbf{u}$:

$$\mathbf{u}^{T}\mathbf{H}\mathbf{u}=(\mathbf{X}\mathbf{u})^{T}\mathbf{S}(\mathbf{X}\mathbf{u})=\mathbf{v}^{T}\mathbf{S}\mathbf{v}=\sum_{i=1}^{N}v_i^{2}\,\mu_i(1-\mu_i)\geq 0,$$

where $\mathbf{v}=\mathbf{X}\mathbf{u}$.

Since every $\mu_i=\sigma(\mathbf{w}^{T}\mathbf{x}_i)$ lies strictly in $(0,1)$, each term is non-negative, so $\mathbf{H}$ is positive semidefinite; it is positive definite whenever $\mathbf{X}$ has full column rank, since then $\mathbf{v}=\mathbf{X}\mathbf{u}\neq\mathbf{0}$ for every $\mathbf{u}\neq\mathbf{0}$.
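The positive-semidefiniteness of $\mathbf{H}=\mathbf{X}^{T}\mathbf{S}\mathbf{X}$ can be confirmed numerically by inspecting its eigenvalues. A minimal NumPy sketch on random data (the data are illustrative, not from the book; with a random Gaussian $\mathbf{X}$ of more rows than columns, full column rank, and hence positive definiteness, holds almost surely):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Small random problem, for illustration only
rng = np.random.default_rng(1)
N, D = 50, 4
X = rng.normal(size=(N, D))
w = rng.normal(size=D)

# Hessian of the NLL: H = X^T S X with S = diag(mu_i (1 - mu_i))
mu = sigmoid(X @ w)
S = np.diag(mu * (1 - mu))
H = X.T @ S @ X

# All eigenvalues of a PSD matrix are non-negative
eigvals = np.linalg.eigvalsh(H)
print(eigvals.min())  # should be >= 0 (strictly positive if X has full column rank)
```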

2021-03-24 13:42