Homepage › Solution manuals › Yaser Abu-Mostafa › Learning from Data › Exercise 6.7
Exercise 6.7
Answers
- (a)
- If
is not training set consistent, and if
is a point which is not training set consistent, so we have .
According to CNN heuristic, we can find the nearest data point to
which is not in
and has class .
We need to prove that such a point exists.
We first fix the number of nearest neighbours to , and consider the neighbourhood of in and in , both and have the same number of points, i.e. .
We say and are not equal to each other, otherwise, won’t be an inconsistent point. So there must be some point that is in but not in .
Now we say that among these points that not in , there must be a point such that .
If no such point exists, then all the points in but not in will have the class . Suppose the intersection of and has points, then points in will have class , when it combines with the intersection of points, we will definitely find that the class of in to be , because is composed of points plus the same points, but gives a class of without requiring the points to be all in class .
This is the contradiction, so we proved that the CNN heuristic will surely find a point that is not in and has class
- (b)
- Since the set is a subset of , so for any of the nearest neighbors to from , i.e. , if it belongs to , it must belong to the neighbord from as well, i.e. . So the new nearest neighbors added to , will become neighbors of in .
- (c)
- Since there are total points, from problem (a), each step will add at most 1 point into the original points in if is still not training set consistent, and also the points is the set of , which automatically satisfies the training set consistency. So after at most iterations, the CNN heuristic must terminate with a training set consistent .