Exercise 7.2 - Multi-output linear regression

Answers

For multi-output linear regression, if the outputs are independent, then for each output dimension, indexed by $j$, we have:

$$p(y_j \mid \mathbf{x}, \mathbf{w}_j) = \mathcal{N}(y_j \mid \mathbf{x}^T \mathbf{w}_j, \sigma_j^2),$$

then:

$$p(\mathbf{y} \mid \mathbf{x}, \mathbf{W}) = \prod_{j=1}^{M} \mathcal{N}(y_j \mid \mathbf{x}^T \mathbf{w}_j, \sigma_j^2).$$

Together with independence across the $N$ data points, this implies:

$$p(\mathbf{Y} \mid \mathbf{X}, \mathbf{W}) = \prod_{n=1}^{N} \prod_{j=1}^{M} \mathcal{N}(y_{n,j} \mid \mathbf{x}_n^T \mathbf{w}_j, \sigma_j^2).$$

Taking the negative logarithm and keeping only the terms that depend on $\mathbf{W}$ (dropping an irrelevant factor of $1/2$), we obtain the loss:

$$L(\mathbf{W}) = \sum_{n=1}^{N} \sum_{j=1}^{M} \frac{1}{\sigma_j^2} \left( y_{n,j} - \mathbf{x}_n^T \mathbf{w}_j \right)^2.$$
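As a concrete rendering of this loss, here is a minimal NumPy sketch; the shape conventions ($\mathbf{X}$ stored as a $D \times N$ array, $\mathbf{Y}$ as $N \times M$) follow the text above, while the function name and signature are my own:

```python
import numpy as np

def loss(W, X, Y, sigma2):
    """L(W) = sum_{n,j} (y_{n,j} - x_n^T w_j)^2 / sigma_j^2.

    X:      (D, N) design matrix, one input per column.
    Y:      (N, M) outputs, one row per data point.
    W:      (D, M) weights, whose j-th column is w_j.
    sigma2: (M,) per-output noise variances.
    """
    residuals = Y - X.T @ W  # (N, M); entry (n, j) is y_{n,j} - x_n^T w_j
    return float(np.sum(residuals**2 / sigma2))
```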

Interchanging the order of summation decomposes the loss as:

$$\sum_{j=1}^{M} L_j(\mathbf{w}_j) = \sum_{j=1}^{M} \frac{1}{\sigma_j^2} \sum_{n=1}^{N} \left( y_{n,j} - \mathbf{x}_n^T \mathbf{w}_j \right)^2,$$

where each:

$$L_j(\mathbf{w}_j) = \frac{1}{\sigma_j^2} \sum_{n=1}^{N} \left( y_{n,j} - \mathbf{x}_n^T \mathbf{w}_j \right)^2,$$

is just the MLE loss of a one-dimensional linear regression (the prefactor $1/\sigma_j^2$ only scales $L_j$ and does not affect its minimizer). Thus the columns of $\mathbf{W}$ can be estimated independently by:

$$\mathbf{w}_j^{\text{MLE}} = (\mathbf{X} \mathbf{X}^T)^{-1} \mathbf{X} \mathbf{Y}_j,$$

where $\mathbf{X}$ is the $D \times N$ design matrix and $\mathbf{Y}_j$ is the length-$N$ column vector holding the $j$-th component of each output. The symbols differ slightly from those in the textbook, but simple renamings ($\alpha$-reductions) would eliminate the difference. Note that Equation (7.90) in the textbook is incorrect: it is missing one factor of the design matrix.

As a compact way of writing $\mathbf{W}^{\text{MLE}}$, we have:

$$\mathbf{W}^{\text{MLE}} = (\mathbf{X} \mathbf{X}^T)^{-1} \mathbf{X} \mathbf{Y},$$

where $\mathbf{Y}$ is the $N \times M$ matrix whose $j$-th column is $\mathbf{Y}_j$.
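The closed form translates directly into code. Below is a minimal sketch under the same shape conventions ($\mathbf{X}$: $D \times N$, $\mathbf{Y}$: $N \times M$); solving the linear system rather than explicitly inverting $\mathbf{X}\mathbf{X}^T$ is the numerically safer choice:

```python
import numpy as np

def mle_weights(X, Y):
    """Closed-form MLE: W = (X X^T)^{-1} X Y.

    X: (D, N) design matrix; Y: (N, M) output matrix.
    Returns W of shape (D, M), whose j-th column is w_j^MLE.
    """
    # Solve the normal equations instead of forming the inverse explicitly.
    return np.linalg.solve(X @ X.T, X @ Y)
```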

For the case in this exercise, we have $D = 2$, $N = 6$, $M = 2$:

$$\mathbf{X} = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 \end{pmatrix},$$

$$\mathbf{Y} = \begin{pmatrix} -1 & -1 \\ -1 & -2 \\ -2 & -1 \\ 1 & 1 \\ 1 & 2 \\ 2 & 1 \end{pmatrix}.$$
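Plugging these matrices into the closed form gives the intermediate quantities:

$$\mathbf{X} \mathbf{X}^T = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} = 3 \mathbf{I}_2, \qquad \mathbf{X} \mathbf{Y} = \begin{pmatrix} -4 & -4 \\ 4 & 4 \end{pmatrix}.$$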

Thus:

$$\mathbf{W}^{\text{MLE}} = \begin{pmatrix} -\tfrac{4}{3} & -\tfrac{4}{3} \\ \tfrac{4}{3} & \tfrac{4}{3} \end{pmatrix}.$$

One can observe that the two columns of $\mathbf{W}^{\text{MLE}}$ are identical. This is expected: within each group of repeated inputs, the two columns of $\mathbf{Y}$ sum to the same value ($-4$ over the first three rows, $4$ over the last three).
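As a quick numerical sanity check of the hand computation (self-contained; it simply re-enters the exercise's $\mathbf{X}$ and $\mathbf{Y}$):

```python
import numpy as np

X = np.array([[1, 1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1, 1]], dtype=float)            # (D, N) = (2, 6)
Y = np.array([[-1, -1], [-1, -2], [-2, -1],
              [ 1,  1], [ 1,  2], [ 2,  1]], dtype=float)  # (N, M) = (6, 2)

W = np.linalg.solve(X @ X.T, X @ Y)  # (X X^T)^{-1} X Y
print(W)                             # expected: [[-4/3, -4/3], [4/3, 4/3]]
```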
