Exercise 4.16 - Likelihood ratio for Gaussians

Answers

Consider a classifier for two classes whose generative distributions are two normal distributions, $p(x \mid y = i) = \mathcal{N}(x \mid \mu_i, \Sigma_i)$ for $i \in \{0, 1\}$. By Bayes' rule:

$$\log \frac{p(y=1 \mid x)}{p(y=0 \mid x)} = \log \frac{p(x \mid y=1)}{p(x \mid y=0)} + \log \frac{p(y=1)}{p(y=0)}.$$

The first term on the r.h.s. is the log-likelihood ratio; the evidence $p(x)$ cancels when taking the ratio of the two posteriors.

When we have arbitrary covariance matrices:

$$\frac{p(x \mid y=1)}{p(x \mid y=0)} = \sqrt{\frac{|\Sigma_0|}{|\Sigma_1|}} \exp\left\{ -\frac{1}{2}(x-\mu_1)^T \Sigma_1^{-1} (x-\mu_1) + \frac{1}{2}(x-\mu_0)^T \Sigma_0^{-1} (x-\mu_0) \right\}.$$

Since $\Sigma_0$ and $\Sigma_1$ are arbitrary, this expression cannot be simplified further:

$$\log \frac{p(x \mid y=1)}{p(x \mid y=0)} = \frac{1}{2} \log \frac{|\Sigma_0|}{|\Sigma_1|} - \frac{1}{2}(x-\mu_1)^T \Sigma_1^{-1} (x-\mu_1) + \frac{1}{2}(x-\mu_0)^T \Sigma_0^{-1} (x-\mu_0).$$
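As a numerical sanity check, this log-ratio can be evaluated directly and compared against the Gaussian log-densities from SciPy. This is a minimal sketch; the means, covariances, and test point below are arbitrary illustrative values, not part of the exercise:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
D = 3

# Illustrative parameters: random means and random SPD covariances.
mu0, mu1 = rng.normal(size=D), rng.normal(size=D)
A0 = rng.normal(size=(D, D))
A1 = rng.normal(size=(D, D))
Sigma0 = A0 @ A0.T + D * np.eye(D)  # SPD by construction
Sigma1 = A1 @ A1.T + D * np.eye(D)
x = rng.normal(size=D)

def quad(x, mu, Sigma):
    """Mahalanobis quadratic form (x - mu)^T Sigma^{-1} (x - mu)."""
    d = x - mu
    return d @ np.linalg.solve(Sigma, d)

# Closed-form log-likelihood ratio for arbitrary covariances.
llr = (0.5 * np.log(np.linalg.det(Sigma0) / np.linalg.det(Sigma1))
       - 0.5 * quad(x, mu1, Sigma1)
       + 0.5 * quad(x, mu0, Sigma0))

# Reference value computed from the log-densities themselves.
ref = (multivariate_normal.logpdf(x, mu1, Sigma1)
       - multivariate_normal.logpdf(x, mu0, Sigma0))

assert np.isclose(llr, ref)
```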

Note that the decision boundary ($\log \frac{p(x \mid y=1)}{p(x \mid y=0)} = 0$) is a quadratic surface in $D$-dimensional space.

When both covariance matrices are given by a shared $\Sigma$:

$$\frac{p(x \mid y=1)}{p(x \mid y=0)} = \exp\left\{ -\frac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1) + \frac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0) \right\},$$

so:

$$\begin{aligned} \log \frac{p(x \mid y=1)}{p(x \mid y=0)} &= -\frac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1) + \frac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0) \\ &= -\frac{1}{2} \operatorname{tr}\left( \Sigma^{-1} \left[ (x-\mu_1)(x-\mu_1)^T - (x-\mu_0)(x-\mu_0)^T \right] \right). \end{aligned}$$
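The second equality uses the identity $a^T M a = \operatorname{tr}(M a a^T)$. A quick numerical confirmation of the trace form, again with illustrative random parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 3
mu0, mu1, x = rng.normal(size=D), rng.normal(size=D), rng.normal(size=D)
A = rng.normal(size=(D, D))
Sigma = A @ A.T + D * np.eye(D)  # shared SPD covariance
Sinv = np.linalg.inv(Sigma)

# Direct form: difference of the two quadratic terms.
direct = (-0.5 * (x - mu1) @ Sinv @ (x - mu1)
          + 0.5 * (x - mu0) @ Sinv @ (x - mu0))

# Trace form with Phi = (x - mu1)(x - mu1)^T - (x - mu0)(x - mu0)^T.
Phi = np.outer(x - mu1, x - mu1) - np.outer(x - mu0, x - mu0)
trace_form = -0.5 * np.trace(Sinv @ Phi)

assert np.isclose(direct, trace_form)
```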

When $\Sigma = \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_D)$ is a diagonal matrix, we have:

$$\log \frac{p(x \mid y=1)}{p(x \mid y=0)} = -\frac{1}{2} \operatorname{tr}\left( \Lambda^{-1} \Phi \right) = -\frac{1}{2} \sum_{i=1}^{D} \lambda_i^{-1} \Phi_{i,i},$$

where:

$$\Phi = (x-\mu_1)(x-\mu_1)^T - (x-\mu_0)(x-\mu_0)^T.$$
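Because $\Lambda^{-1}$ is diagonal, only the diagonal entries of $\Phi$ contribute to the trace. A short sketch of this reduction (the diagonal entries $\lambda_i$ below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4
mu0, mu1, x = rng.normal(size=D), rng.normal(size=D), rng.normal(size=D)
lam = rng.uniform(0.5, 2.0, size=D)  # diagonal entries of Sigma = Lambda

Phi = np.outer(x - mu1, x - mu1) - np.outer(x - mu0, x - mu0)

# tr(Lambda^{-1} Phi) reduces to a weighted sum over Phi's diagonal.
trace_form = -0.5 * np.trace(np.diag(1.0 / lam) @ Phi)
sum_form = -0.5 * np.sum(np.diag(Phi) / lam)

assert np.isclose(trace_form, sum_form)
```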

Finally, if $\Sigma = \sigma^2 I$, then:

$$\log \frac{p(x \mid y=1)}{p(x \mid y=0)} = -\frac{1}{2\sigma^2} \operatorname{tr}(\Phi).$$
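Since $\operatorname{tr}(vv^T) = \|v\|^2$, this case reduces to a comparison of squared Euclidean distances to the two means:

$$\operatorname{tr}(\Phi) = \|x - \mu_1\|^2 - \|x - \mu_0\|^2.$$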

Note that in the last three cases the decision boundary is a hyperplane, since the quadratic term in $x$ cancels in $\Phi$.
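For concreteness, expanding $\Phi$ makes the cancellation explicit:

$$\Phi = (x x^T - x \mu_1^T - \mu_1 x^T + \mu_1 \mu_1^T) - (x x^T - x \mu_0^T - \mu_0 x^T + \mu_0 \mu_0^T),$$

where the $x x^T$ terms cancel. Substituting back into the trace gives

$$\log \frac{p(x \mid y=1)}{p(x \mid y=0)} = (\mu_1 - \mu_0)^T \Sigma^{-1} x - \frac{1}{2} \left( \mu_1^T \Sigma^{-1} \mu_1 - \mu_0^T \Sigma^{-1} \mu_0 \right),$$

which is affine in $x$, so the decision boundary is a hyperplane.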
