Exercise 9.5

Answers

We compute the distances between x1,x2 with xtest, we have

d12 = |x 1 xtest|2 = d(a i + 1)2,d 22 = |x 2 xtest|2 = 4 + d(b i + 1)2.

Suppose there are k + 1s in ai and l + 1s in bi, then we have

d12 = d(a i + 1)2 = 4k,d 22 = 4 + 4l = 4(l + 1).

To correctly classify xtest we want to have d1 < d2, which indicates that k < l + 1, i.e. k l.

So we need compute the probabilities of the number of + 1 in a is less than or equal to the number of + 1s in b. Both a and b have d elements, by symmetry the probability of P(k > l) = P(l < k), i.e. the probability of a having more + 1 than b is equal to the probability of a having less + 1 than b. So we have

2P(k > l) + P(k = l) = 1

, thus P(k l) = P(k < l) + P(k = l) = 1 2(1 + P(k = l)).

We only have to solve the probability P(k = l).

For a given k, the probability of

P [(a has k + 1) (b has k + 1)] = (d k) 2d (d k) 2d = ((d k))2 22d .

So we have

P(k = l) = k=0dP [(a has k + 1) (b has k + 1)] = k=0d (d k)2 22d = 1 22d k=0d(d k)2 = 1 22d( 2d d) = 1 22d (2d)! d!d! apply stirling approximation n! = 2πn (n e )n 1 22d 2π2d (2d e ) 2d 2πd (d e ) d2πd (d e ) d = 1 πd

So

P(k l) = 1 2(1 + P(k = l)) = 1 2 + 1 2 1 πd = 1 2 + O( 1 d).

This is the probability of classifying xtest correctly with two data points.

If there’s a third data point x3, then to correctly classify the xtest, we need have both d1 < d2 and d1 < d3, so the probability

P = P(k l)P(k l) = 1 4 + O( 1 d).

The probability of correctly classifying the xtest drop about half.

User profile picture
2021-12-08 10:22
Comments