Exercise 9.4

Answers

(a)

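Since $x_{1} = {\hat{x}}_{1}$,

$variance (x_{1}) = variance ({\hat{x}}_{1}) = 1$

Because ${\hat{x}}_{1}$ and ${\hat{x}}_{2}$ are independent, the variance of their weighted sum is the weighted sum of their variances:

$variance (x_{2}) = variance (\sqrt{1 - 𝜖^{2}} {\hat{x}}_{1} + 𝜖 {\hat{x}}_{2}) = (1 - 𝜖^{2}) variance ({\hat{x}}_{1}) + 𝜖^{2} variance ({\hat{x}}_{2}) = 1$

Both inputs have zero mean, $E [{\hat{x}}_{1}^{2}] = 1$, and $E [{\hat{x}}_{1} {\hat{x}}_{2}] = E [{\hat{x}}_{1}] E [{\hat{x}}_{2}] = 0$, so

$covariance (x_{1}, x_{2}) = E [(x_{1} - {\bar{x}}_{1}) (x_{2} - {\bar{x}}_{2})] = E [x_{1} x_{2}] = E [\sqrt{1 - 𝜖^{2}} {\hat{x}}_{1}^{2} + 𝜖 {\hat{x}}_{1} {\hat{x}}_{2}] = \sqrt{1 - 𝜖^{2}}$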
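As a quick numerical sanity check of part (a), the sketch below estimates the three moments by sampling. It assumes ${\hat{x}}_{1}$ and ${\hat{x}}_{2}$ are independent standard normals (the exercise only requires zero mean and unit variance); the function name `sample_moments`, the sample size, and the seed are illustrative choices, not part of the exercise.

```python
import math
import random


def sample_moments(eps, n=200_000, seed=0):
    """Monte Carlo estimates of var(x1), var(x2) and cov(x1, x2).

    Assumption: x_hat_1 and x_hat_2 are independent standard normals.
    """
    rng = random.Random(seed)
    rho = math.sqrt(1.0 - eps ** 2)
    x1s, x2s = [], []
    for _ in range(n):
        h1, h2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1s.append(h1)                    # x1 = x_hat_1
        x2s.append(rho * h1 + eps * h2)   # x2 = sqrt(1 - eps^2) x_hat_1 + eps x_hat_2
    m1, m2 = sum(x1s) / n, sum(x2s) / n
    var1 = sum((a - m1) ** 2 for a in x1s) / n
    var2 = sum((b - m2) ** 2 for b in x2s) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1s, x2s)) / n
    return var1, var2, cov


var1, var2, cov = sample_moments(eps=0.6)
print(var1, var2, cov)  # compare with 1, 1 and sqrt(1 - 0.6^2) = 0.8
```

The estimates should land close to the derived values, with the usual sampling error that shrinks as `n` grows.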

(b)

$\begin{array}{rl} f (x) & = w_{1} x_{1} + w_{2} x_{2} \\ & = w_{1} {\hat{x}}_{1} + w_{2} (\sqrt{1 - 𝜖^{2}} {\hat{x}}_{1} + 𝜖 {\hat{x}}_{2}) \\ & = (w_{1} + w_{2} \sqrt{1 - 𝜖^{2}}) {\hat{x}}_{1} + w_{2} 𝜖 {\hat{x}}_{2} \\ & = ŵ_{1} {\hat{x}}_{1} + ŵ_{2} {\hat{x}}_{2} \end{array}$

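Matching coefficients gives $ŵ_{1} = w_{1} + w_{2} \sqrt{1 - 𝜖^{2}}, ŵ_{2} = w_{2} 𝜖$. Conversely, for any $ŵ_{1}, ŵ_{2}$ and $𝜖 \neq 0$, these equations are solved by $w_{2} = ŵ_{2} / 𝜖$ and $w_{1} = ŵ_{1} - ŵ_{2} \sqrt{1 - 𝜖^{2}} / 𝜖$, so $f$ is linear in $x_{1}, x_{2}$.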
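The change of weights in part (b) can be checked at any test point. The following is a minimal sketch; the helper names `to_correlated_weights`, `f_hat`, and `f_corr` and the sample numbers are made up for illustration.

```python
import math


def to_correlated_weights(w_hat1, w_hat2, eps):
    """Solve w_hat1 = w1 + w2*sqrt(1 - eps^2), w_hat2 = w2*eps for (w1, w2).

    Requires eps != 0.
    """
    w2 = w_hat2 / eps
    w1 = w_hat1 - w2 * math.sqrt(1.0 - eps ** 2)
    return w1, w2


def f_hat(w_hat1, w_hat2, h1, h2):
    """Target written in the independent variables (h1, h2)."""
    return w_hat1 * h1 + w_hat2 * h2


def f_corr(w1, w2, h1, h2, eps):
    """Same target written in the correlated inputs built from (h1, h2)."""
    x1 = h1
    x2 = math.sqrt(1.0 - eps ** 2) * h1 + eps * h2
    return w1 * x1 + w2 * x2


eps = 0.5
w1, w2 = to_correlated_weights(2.0, -3.0, eps)
# The two values agree up to floating-point rounding.
print(f_hat(2.0, -3.0, 0.7, -1.2), f_corr(w1, w2, 0.7, -1.2, eps))
```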

(c)

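Setting $ŵ_{1} = ŵ_{2} = 1$ in the relations from part (b) gives

$w_{1} = \frac{𝜖 - \sqrt{1 - 𝜖^{2}}}{𝜖}, w_{2} = \frac{1}{𝜖}$

These are the only weights that implement the target, so we need

$C \geq w_{1}^{2} + w_{2}^{2} = 2 \frac{1 - 𝜖 \sqrt{1 - 𝜖^{2}}}{𝜖^{2}}$

and the right-hand side is the minimum $C$.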
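The algebra in part (c) is easy to get wrong, so here is a small sketch that compares $w_{1}^{2} + w_{2}^{2}$ computed directly from the weights with the closed-form expression. The function names are illustrative only.

```python
import math


def min_C_from_weights(eps):
    """w1^2 + w2^2 for the weights that give w_hat1 = w_hat2 = 1."""
    w2 = 1.0 / eps
    w1 = (eps - math.sqrt(1.0 - eps ** 2)) / eps
    return w1 ** 2 + w2 ** 2


def min_C_closed_form(eps):
    """Closed form 2 * (1 - eps * sqrt(1 - eps^2)) / eps^2."""
    return 2.0 * (1.0 - eps * math.sqrt(1.0 - eps ** 2)) / eps ** 2


for eps in (0.1, 0.5, 0.9):
    # The two columns should match up to floating-point rounding.
    print(eps, min_C_from_weights(eps), min_C_closed_form(eps))
```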

(d)

As $𝜖 \to 0$, the minimum $C \to \infty$, growing like $2 / 𝜖^{2}$. Implementing the target function therefore requires a huge $C$; for any fixed finite $C$, the target falls outside the hypothesis set once $𝜖$ is small enough.
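To get a feel for how fast the minimum $C$ blows up, the sketch below evaluates the closed form from part (c) for a few decreasing values of $𝜖$ and compares it with $2 / 𝜖^{2}$. The chosen values of $𝜖$ are arbitrary.

```python
import math


def min_C(eps):
    """Minimum C from part (c): 2 * (1 - eps * sqrt(1 - eps^2)) / eps^2."""
    return 2.0 * (1.0 - eps * math.sqrt(1.0 - eps ** 2)) / eps ** 2


for eps in (0.5, 0.1, 0.01, 0.001):
    c = min_C(eps)
    # The last column is min C divided by 2/eps^2; it approaches 1 as eps shrinks.
    print(f"eps={eps:<6} min C={c:.4g}  C*eps^2/2={c * eps ** 2 / 2:.4f}")
```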

(e)

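With highly correlated inputs, implementing the target requires very large weights and hence a very large $C$, which amounts to almost no regularization. If there is significant noise in the data, such a weakly constrained model is likely to overfit, so the var term can be high while bias is low. Choosing a smaller $C$ to control var excludes the target from the hypothesis set and raises bias instead.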
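One way to make the var side concrete is under extra assumptions that are not part of the exercise: unregularized least squares with no intercept, $N$ samples, and noise variance $\sigma^{2}$. In that setting the covariance of the learned weights is approximately $(\sigma^{2} / N) \Sigma^{-1}$, where $\Sigma$ is the input covariance from part (a). The sketch below computes $\Sigma^{-1}$; the function name is illustrative.

```python
import math


def inverse_input_covariance(eps):
    """Inverse of Sigma = [[1, rho], [rho, 1]] with rho = sqrt(1 - eps^2)."""
    rho = math.sqrt(1.0 - eps ** 2)
    det = 1.0 - rho ** 2  # equals eps^2 up to rounding
    return [[1.0 / det, -rho / det], [-rho / det, 1.0 / det]]


for eps in (0.5, 0.1, 0.01):
    inv = inverse_input_covariance(eps)
    # The diagonal entry is 1/eps^2, so it blows up as eps -> 0.
    print(eps, inv[0][0])
```

The diagonal entries scale as $1 / 𝜖^{2}$, the same rate at which the minimum $C$ grows, which is consistent with high var when the inputs are nearly collinear.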

niuers

2021-12-08 10:21
