Solution manual: Learning from Data (Yaser Abu-Mostafa)

Exercise 9.5

Answers

We compute the squared distances from $x_{1}$ and $x_{2}$ to $x_{test}$:

d_{1}^{2} = \| x_{1} - x_{test} \|^{2} = \sum_{i = 1}^{d} {(a_{i} + 1)}^{2}, \quad d_{2}^{2} = \| x_{2} - x_{test} \|^{2} = 4 + \sum_{i = 1}^{d} {(b_{i} + 1)}^{2} .

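Suppose $k$ of the $a_{i}$ equal $+1$ and $l$ of the $b_{i}$ equal $+1$. Each $+1$ entry contributes ${(1 + 1)}^{2} = 4$ to the sum and each $-1$ entry contributes $0$, so

d_{1}^{2} = \sum_{i = 1}^{d} {(a_{i} + 1)}^{2} = 4 k, \quad d_{2}^{2} = 4 + 4 l = 4 (l + 1) .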

To classify $x_{test}$ correctly we need $d_{1} < d_{2}$, which means $k < l + 1$, i.e. $k \leq l$ since $k$ and $l$ are integers.
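The reduction from distances to counts can be checked by brute force for small $d$. The sketch below is only an illustration: it assumes $x_{test} = (-1, \dots, -1)$, $x_{1} = (-1, a)$ and $x_{2} = (+1, b)$, a choice of first coordinates made here to match the distance formulas above, and `sq_dist` and `distances_match_counts` are hypothetical helper names.

```python
from itertools import product


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))


def distances_match_counts(d):
    """Enumerate all a, b in {-1, +1}^d and check d1^2 = 4k, d2^2 = 4(l + 1),
    and that d1 < d2 holds exactly when k <= l."""
    x_test = (-1,) * (d + 1)
    for a in product((-1, 1), repeat=d):
        for b in product((-1, 1), repeat=d):
            x1 = (-1,) + a  # assumption: first coordinate agrees with x_test
            x2 = (1,) + b   # assumption: first coordinate differs from x_test
            k = a.count(1)  # number of +1 entries in a
            l = b.count(1)  # number of +1 entries in b
            d1_sq = sq_dist(x1, x_test)
            d2_sq = sq_dist(x2, x_test)
            if d1_sq != 4 * k or d2_sq != 4 * (l + 1):
                return False
            if (d1_sq < d2_sq) != (k <= l):
                return False
    return True
```

Calling `distances_match_counts(d)` for a small `d` enumerates all $4^{d}$ pairs, so it is only practical for single-digit dimensions.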

So we need the probability that the number of $+1$s in $a$ is less than or equal to the number of $+1$s in $b$. Both $a$ and $b$ have $d$ elements with the same distribution, so by symmetry $P (k > l) = P (k < l)$, i.e. the probability of $a$ having more $+1$s than $b$ equals the probability of $a$ having fewer $+1$s than $b$. Since the cases $k > l$, $k < l$ and $k = l$ are exhaustive, we have

2 P (k < l) + P (k = l) = 1 ,

thus $P (k \leq l) = P (k < l) + P (k = l) = \frac{1}{2} (1 + P (k = l)) .$

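It remains to compute $P (k = l)$.

Each of the $2^{d}$ sign patterns is equally likely and $\binom{d}{k}$ of them contain exactly $k$ entries equal to $+1$. Since $a$ and $b$ are independent, for a given $k$ we have

P [(a \text{ has } k \text{ entries } {+1}) \cap (b \text{ has } k \text{ entries } {+1})] = \frac{\binom{d}{k}}{2^{d}} \cdot \frac{\binom{d}{k}}{2^{d}} = \frac{{\binom{d}{k}}^{2}}{2^{2 d}} .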

Summing over $k$ gives

\begin{aligned}
P (k = l) &= \sum_{k = 0}^{d} P [(a \text{ has } k \text{ entries } {+1}) \cap (b \text{ has } k \text{ entries } {+1})] \\
&= \sum_{k = 0}^{d} \frac{{\binom{d}{k}}^{2}}{2^{2 d}} \\
&= \frac{1}{2^{2 d}} \sum_{k = 0}^{d} {\binom{d}{k}}^{2} \\
&= \frac{1}{2^{2 d}} \binom{2 d}{d} && \text{(Vandermonde's identity)} \\
&= \frac{1}{2^{2 d}} \frac{(2 d)!}{d! \, d!} \\
&\approx \frac{1}{2^{2 d}} \frac{\sqrt{2 \pi \cdot 2 d} \, {(\frac{2 d}{e})}^{2 d}}{\sqrt{2 \pi d} \, {(\frac{d}{e})}^{d} \sqrt{2 \pi d} \, {(\frac{d}{e})}^{d}} && \text{(Stirling: } n! \approx \sqrt{2 \pi n} \, {(\tfrac{n}{e})}^{n} \text{)} \\
&= \frac{1}{\sqrt{\pi d}}
\end{aligned}

P (k \leq l) = \frac{1}{2} (1 + P (k = l)) \approx \frac{1}{2} + \frac{1}{2} \frac{1}{\sqrt{\pi d}} = \frac{1}{2} + O \left( \frac{1}{\sqrt{d}} \right) .

This is the probability of classifying $x_{test}$ correctly with two data points.
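As a numerical sanity check, the exact value of $P (k \leq l)$ can be compared with the Stirling estimate. This is a minimal sketch assuming $k$ and $l$ are independent and each follows a Binomial$(d, \frac{1}{2})$ distribution; the function names are hypothetical and chosen here for illustration.

```python
from math import comb, pi, sqrt


def prob_tie(d):
    """Exact P(k = l) = C(2d, d) / 4^d for independent k, l ~ Binomial(d, 1/2)."""
    return comb(2 * d, d) / 4 ** d


def prob_correct_two_points(d):
    """Exact P(k <= l), using the symmetry result (1 + P(k = l)) / 2."""
    return 0.5 * (1 + prob_tie(d))


def prob_correct_direct(d):
    """P(k <= l) by summing the joint distribution directly (no symmetry used)."""
    favourable = sum(
        comb(d, k) * comb(d, l)
        for k in range(d + 1)
        for l in range(k, d + 1)
    )
    return favourable / 4 ** d


def prob_correct_stirling(d):
    """Stirling-based estimate 1/2 + 1 / (2 * sqrt(pi * d))."""
    return 0.5 + 0.5 / sqrt(pi * d)
```

For $d = 1$ the exact value is $\frac{3}{4}$. The gap between `prob_correct_two_points(d)` and `prob_correct_stirling(d)` shrinks as `d` grows, while both approach $\frac{1}{2}$ from above.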

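If there is a third data point $x_{3}$, then to classify $x_{test}$ correctly we need both $d_{1} < d_{2}$ and $d_{1} < d_{3}$. Assume $x_{3}$ is built like $x_{2}$, with $m$ entries equal to $+1$ drawn independently of $a$ and $b$, so that $d_{3}^{2} = 4 (m + 1)$ and the condition becomes $k \leq l$ and $k \leq m$. These two events share $k$, so they are not independent and their probabilities cannot simply be multiplied. Since $k$, $l$, $m$ are independent and identically distributed, each of them is the strict minimum with the same probability, and a tie for the minimum requires two of the counts to coincide, which has probability $O (\frac{1}{\sqrt{d}})$ by the computation of $P (k = l)$. Hence

P = P (k \leq l, \, k \leq m) = \frac{1}{3} + O \left( \frac{1}{\sqrt{d}} \right) .

The probability of correctly classifying $x_{test}$ drops from about $\frac{1}{2}$ to about $\frac{1}{3}$.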
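The three-point probability can also be evaluated exactly instead of asymptotically. The sketch below assumes, for illustration, that $x_{3}$ has the same form as $x_{2}$ with an independent count $m$ of $+1$ entries, so that correct classification means $k \leq l$ and $k \leq m$; `prob_correct_three_points` is a hypothetical helper name. Because both comparisons involve the same $k$, it sums over $k$ rather than multiplying two marginal probabilities.

```python
from math import comb


def prob_correct_three_points(d):
    """Exact P(k <= l and k <= m) for independent k, l, m ~ Binomial(d, 1/2)."""
    # tail[j] = number of sign patterns of length d with at least j entries +1
    tail = [0] * (d + 2)
    for j in range(d, -1, -1):
        tail[j] = tail[j + 1] + comb(d, j)
    # For each k: C(d, k) patterns for a, times tail[k] choices for each of b and c
    favourable = sum(comb(d, k) * tail[k] ** 2 for k in range(d + 1))
    return favourable / 8 ** d
```

For $d = 1$ this gives $\frac{5}{8}$: the case $k = 0$ always succeeds, and the case $k = 1$ succeeds only when $l = m = 1$.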

niuers

2021-12-08 10:22
