Exercise 11.14 - EM for censored linear regression

Answers

The model for censored linear regression (non-Bayesian version) is:

$$\epsilon_i \sim \mathcal{N}(\epsilon \mid 0, \sigma^2), \qquad z_i = \mathbf{w}^T \mathbf{x}_i + \epsilon_i, \qquad y_i = \min(z_i, c_i).$$
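
To make the setup concrete, here is a minimal sketch (not part of the exercise; all variable names and the fixed threshold are my assumptions) of sampling synthetic data from this model:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 200, 3

w_true = rng.normal(size=D)            # true weights w
sigma_true = 0.5                       # true noise scale sigma

X = rng.normal(size=(N, D))            # covariates x_i
eps = rng.normal(0.0, sigma_true, N)   # eps_i ~ N(0, sigma^2)
z = X @ w_true + eps                   # latent z_i = w^T x_i + eps_i
c = np.full(N, 1.0)                    # censoring thresholds c_i (fixed here)
y = np.minimum(z, c)                   # observed y_i = min(z_i, c_i)
censored = (y == c)                    # mask of right-censored observations
```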

The observed variables are $y_i$, $c_i$, and $\mathbf{x}_i$, and we are to estimate $\mathbf{w}$ and $\sigma^2$. The latent variable in this model is $z_i$. The complete likelihood is:

$$p(y_i, z_i \mid c_i, \mathbf{x}_i, \mathbf{w}, \sigma^2) = \mathcal{N}(y_i \mid \mathbf{w}^T \mathbf{x}_i, \sigma^2),$$

if $y_i < c_i$, and is:

$$p(y_i \mid c_i, \mathbf{x}_i, \mathbf{w}, \sigma^2) = \int_{c_i}^{\infty} \mathcal{N}(z \mid \mathbf{w}^T \mathbf{x}_i, \sigma^2)\, dz,$$

if $y_i = c_i$, which implies $z_i \geq c_i$. Since this integral would appear inside the logarithm, it is more convenient to approximate the censored term through its moments. One possible approximation uses (11.137) and (11.138): when $y_i = c_i$, the first and second moments of $z_i$ are

$$\mathbb{E}[z_i \mid z_i \geq c_i] = \mathbf{w}^T \mathbf{x}_i + \sigma H\!\left(\frac{c_i - \mathbf{w}^T \mathbf{x}_i}{\sigma}\right), \qquad \mathbb{E}[z_i^2 \mid z_i \geq c_i] = (\mathbf{w}^T \mathbf{x}_i)^2 + \sigma^2 + \sigma (c_i + \mathbf{w}^T \mathbf{x}_i) H\!\left(\frac{c_i - \mathbf{w}^T \mathbf{x}_i}{\sigma}\right),$$

respectively, where $H(u) = \phi(u) / (1 - \Phi(u))$ is the hazard function of the standard normal. Computing these moments is a variant of the usual E-step.
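
In code, this E-step computation could look like the following sketch, assuming $H(u) = \phi(u)/(1 - \Phi(u))$ as above (the function names are mine, not from the text):

```python
from scipy.stats import norm

def hazard(u):
    # H(u) = phi(u) / (1 - Phi(u)); norm.sf is the survival function,
    # which avoids the cancellation in 1 - cdf for large u.
    return norm.pdf(u) / norm.sf(u)

def truncated_moments(mu, sigma, c):
    """First and second moments of z ~ N(mu, sigma^2) given z >= c."""
    u = (c - mu) / sigma
    h = hazard(u)
    m1 = mu + sigma * h                           # E[z | z >= c]
    m2 = mu**2 + sigma**2 + sigma * (c + mu) * h  # E[z^2 | z >= c]
    return m1, m2
```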

The log likelihood for the entire dataset now becomes:

$$\begin{aligned} \log p(\mathbf{Y}, \mathbf{Z} \mid \mathbf{c}, \mathbf{X}, \mathbf{w}, \sigma^2) &= \sum_{i:\, y_i < c_i} \log p(y_i, z_i \mid c_i, \mathbf{x}_i, \mathbf{w}, \sigma^2) + \sum_{i:\, y_i = c_i} \log p(y_i, z_i \mid c_i, \mathbf{x}_i, \mathbf{w}, \sigma^2) \\ &= \sum_{i:\, y_i < c_i} \left[ -\frac{1}{2} \log 2\pi\sigma^2 - \frac{(y_i - \mathbf{w}^T \mathbf{x}_i)^2}{2\sigma^2} \right] + \sum_{i:\, y_i = c_i} \left[ -\frac{1}{2} \log 2\pi\sigma_i^2 - \frac{\left( y_i - \mathbf{w}^T \mathbf{x}_i - \sigma H\!\left(\frac{c_i - \mathbf{w}^T \mathbf{x}_i}{\sigma}\right) \right)^2}{2\sigma_i^2} \right], \end{aligned}$$

where $\sigma_i^2 = \mathbb{E}[z_i^2 \mid z_i \geq c_i] - \mathbb{E}[z_i \mid z_i \geq c_i]^2$.
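
For concreteness, this approximate log likelihood can be evaluated as in the sketch below (it reuses `truncated_moments` and the data arrays from the earlier sketches):

```python
def approx_loglik(X, y, c, censored, w, sigma2):
    """Approximate log likelihood following the split above."""
    mu = X @ w
    sigma = np.sqrt(sigma2)
    # Uncensored terms: exact Gaussian log density.
    r = y[~censored] - mu[~censored]
    ll = np.sum(-0.5 * np.log(2 * np.pi * sigma2) - r**2 / (2 * sigma2))
    # Censored terms: Gaussian with moment-matched mean and variance sigma_i^2.
    m1, m2 = truncated_moments(mu[censored], sigma, c[censored])
    var_i = m2 - m1**2                   # sigma_i^2
    rc = y[censored] - m1                # y_i - w^T x_i - sigma * H(...)
    ll += np.sum(-0.5 * np.log(2 * np.pi * var_i) - rc**2 / (2 * var_i))
    return ll
```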

Finally, in the M-step, we take the partial derivatives of the log likelihood with respect to $\mathbf{w}$ and $\sigma^2$ and set them to zero to obtain the updates. This step is straightforward, since the derivative of $H$ can be computed efficiently (for instance, by automatic differentiation). A sketch of the full EM loop follows.
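
The sketch below puts the two steps together under the same assumptions as the earlier snippets. Rather than zeroing the gradient symbolically, it uses the standard closed-form EM updates for this model (a slight simplification of the weighted objective above): least squares on the imputed $\mathbb{E}[z_i]$ for $\mathbf{w}$, and the mean expected squared residual for $\sigma^2$.

```python
def em_censored_regression(X, y, c, censored, n_iter=50):
    """EM for y_i = min(w^T x_i + eps_i, c_i); returns (w, sigma^2)."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]   # initialize with plain OLS
    sigma2 = np.var(y - X @ w)

    for _ in range(n_iter):
        # E-step: expected sufficient statistics E[z_i], E[z_i^2].
        mu = X @ w
        Ez, Ez2 = y.copy(), y**2               # uncensored: z_i = y_i exactly
        m1, m2 = truncated_moments(mu[censored], np.sqrt(sigma2), c[censored])
        Ez[censored], Ez2[censored] = m1, m2

        # M-step: w minimizes sum_i (E[z_i] - w^T x_i)^2, then
        # sigma^2 = mean_i E[(z_i - w^T x_i)^2].
        w = np.linalg.lstsq(X, Ez, rcond=None)[0]
        mu = X @ w
        sigma2 = np.mean(Ez2 - 2 * Ez * mu + mu**2)
    return w, sigma2

w_hat, sigma2_hat = em_censored_regression(X, y, c, censored)
```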
