\(
\def\naturals{\mathbb{N}}
\def\integers{\mathbb{Z}}
\def\rationals{\mathbb{Q}}
\def\reals{\mathbb{R}}
\)
| Due date: |
5.00 PM(noon), 12th Dec 2015 |
Real Analysis
I.
- Let $x \in \rationals$, $x > 0$.
- If $x^2 < 2$ show that there exists $n \in \naturals$ such that
\[
\left(x+\frac{1}{n}\right)^2 < 2
\].
- If $x^2 > 2$ show that there exists $n \in \naturals$ such that
\[
\left(x-\frac{1}{n}\right)^2 > 2
\]
You will need the Archimedean property for the above problems.
- Prove that
\[
\left(R \left(\cos(x) + i \sin(x)\right)\right)^n = R^n \left(\cos (nx) + i \sin(nx)\right)
\]
where $R \in \reals$ and $i = \sqrt(-1)$. Use mathematical induction.
II.
- Let $X$ be an open set. Show that
- If a finite number of points are removed from $X$, the remaining set is still open.
- This does not remain true if a countable number of poins is removed from $A$. You may show this using a counter example.
- Let $A, B \subset \reals^n$. $A$ and $B$ are both compact sets. Show that $A \cup B$ is also compact. Use the fact that for compact sets every cover has a finite subcover.
III.
-
A sequence $\{x_n\}$ is said to diverge to (plus) infinity if, for any $c > 0$, there exists an $N$ such that for all $n > N$ $x_n > c$. We state without proof that for all $z > 0$, the sequence $x_n = (1+z)^n$ diverges to infinity.
Using this fact, prove that for any $y > 1$ the sequence $x_n = \sqrt[n]{y}$ converges to 1.
- Show that for any $z$ such that $0 < z < 1$, the sequence $x_n = (1-z)^n$ converges to 0.
- Using the above result, show that for all $y$ such that $0 < y < 1$, the sequence $x_n = \sqrt[n]{y}$ converges to 1.
IV.
- Let $\{x_n\}$ be any sequence of integers such that $x_n \neq x_{n+1}$ for any $n \in \naturals$. Show that
- $\{x_n\}$ is not a Cauchy sequence.
- $\{x_n\}$ can have Cauchy subsequences.
- Let $\{x_n\}$ be any sequence of positive numbers. Show that
\[\lim\inf_{n\rightarrow\infty} \frac{x_{n+1}}{x_n} ~\leq~ \lim\inf_{n\rightarrow\infty} \sqrt[n]{x_n} ~\leq~ \lim\sup_{n\rightarrow\infty} \frac{x_{n+1}}{x_n}
\]
You may use the result $\forall x > 0$, as $n \rightarrow \infty$, $\sqrt[n]{x} \rightarrow 1$ here.
V.
- Find $a$ such that
\[
f(x) =
\begin{cases}
x^2-4x & {\rm if}~~x < a \\
-4 & {\rm if}~~x \geq a
\end{cases}
\]
is continuous everywhere.
-
Two continuous functions $f,g : \reals \rightarrow \reals$ have the property that for any two distinct points $x_1, x_2 \in \reals$ such that $x_1 < x_2$, there exists a third point $x_3$ such that $x_1 < x_3 < x_2$ and $f(x_3) = g(x_3)$. Show that $f(x) = g(x)$ for all $x$.
HINT: You may find it useful to construct a sequence $y_n = x - \frac{1}{n}$, and $x_n$ as any number such that $y_n \lt x_n \lt x$, and consider the value of $(f-g)(x_n)$. You will have to apply the definition of continuity to arrive at your final result.
VI.
- Find the value of $a$ such that the following value exists.
\[
\lim_{x \rightarrow -1} \frac{2x^2 - ax - 14}{x^2 -2x -3}
\]
Find the value of the above expression at that limit. You may have to dig into your college math for this solution, since we didn't explicitly cover the required concepts in class.
- Let $f: [0,~\infty) \rightarrow \reals$ be a continuous function defined on $[0,~\infty)$ such that $\lim_{x\rightarrow \infty}f(x) = C$ for some finite $C$. Show that $f$ is bounded. Recall that $\lim_{x\rightarrow \infty}f(x) = C$ implies that for every $\epsilon > 0$, there exists a value $X$ such that for all $x > X$, $|f(x) - C| < \epsilon$.
Probability
I.
- The Gates building has two elevators. The two are independent of each other and perfectly random. At any time, each of them is equally likely to be at any floor. The building has 9 floors. You arrive at the elevator lobby in the 7th floor with the intention of travelling down to the 2nd floor. What is the probability that the next elevator that arrives at the 7th floor will be travelling in the direction you want? How would this improve it the building had 3 elevators?
- You spend a lazy afternoon rolling a fair, six-sided dice. You decide, for some unfathomable reason, that you will only stop rolling the dice when you have obtained at least four fours and three threes. Give the probability distribution function for the number of times you must roll your dice before you can stop and go for dinner.
- All that dice rolling makes you thirsty, and you want to have beer. But your strange principles only allow you to have beer if you leave for dinner after 7pm. Given that you started rolling at 3pm, and each roll takes five minutes, what is the probability that you will have beer?
II.
- In a game show, a participant is shown three doors. Behind one is a car. Behind the other two are goats. The host knows which door has the car. The participant first selects a door. But before he opens it, the host opens one of the other two doors and shows that there is a goat behind that door. The participant is now given the option to switch his first choice. For example, if the doors are named $A$, $B$ and $C$, and the participant chose door $A$ and then the host opens $B$ to show a goat, the particpant is allowed to change his selection to $C$.
Show that the participant has a greater probability of finding the car if he switches his selection, after the host has opened a door to show a goat, i.e. show that $P(finding~car | switch) > P(finding~car | no~switch)$. What is this probability?
- Now, for the same problem, the host does not know which door has the car. The particpant chooses a door, and then the host randomly selects one of the other two doors. The open door turns out to have a goat behind it.
Show that in this condition, the participant gains no advantage by switching his selection.
III.
- $x$ and $y$ are two random variables. $E(y | x) = 0$. Show that $E(xy) = 0$.
- An urn contains three marbles, two black and one white. You and your friend play a game: each of you sticks your hand into the urn and draws a marble (without replacing it in the urn). Whoever draws the white marble gets $100. Using the law of iterated expectations, show that it does not matter who draws the first marble; each of you have the same expected winning.
IV.
$X$ is a Gaussian random variable with mean $\mu$ and variance $\sigma^2$. $Y$ is a derived random variable obtained from $X$ as $Y = \log(1 + \exp(X))$.
What is the probability density function $f_Y(y)$?
V.
$X$ and $Y$ are independent Gaussian random variables. Both $X$ and $Y$ have mean $0$ and variance $1$, i.e. both of them are standard normal RVs.
We define two new derived random variables, $U$ and $V$, as:
\[
U = \frac{1}{2}(X + Y) \\
V = \frac{1}{2}(X - Y)
\]
We define a second level derived random variable
\[
Z = UV
\]
What are the mean and variance of $Z$.
You may find it useful to know that the square of a standard normal random variable is a Chi-squared RV, and has mean 1 and variance 2.
VI.
In 1910 Rutherford, Geiger and Bateman ran a classic experiment on the decay of radioactive substances. They counted the number of alpha particles emitted by a film of polonium (using a technique invented by Geiger, which went on to become the basis for the famous Geiger counter). Counts were obtianed over intervals of one-eighth of a minute (7.5 seconds). The obtained the following results.
In the table below, the left-hand column gives the number of particles observed in a single time unit. The right-hand column is the number of distinct time units in which such an observation was made. For instance, the fourth row of the table indicates that in 525 separate measurements (each over one time unit of 7.5 seconds), only 3 particles were observed in that time interval.
| Number of Particles |
Number of intervals |
| 0 | 57 |
| 1 | 203 |
| 2 | 383 |
| 3 | 525 |
| 4 | 532 |
| 5 | 408 |
| 6 | 273 |
| 7 | 139 |
| 8 | 45 |
| 9 | 27 |
| 10 | 10 |
| 11 | 4 |
| 12 | 0 |
| 13 | 1 |
| 14 | 1 |
| 15 or more | 0 |
Rutherford inferred from these readings that the decay of radioactive materials was Poisson in nature.
Test Rutherford's hypothesis. To do so, first calculate the rate of emission $\lambda$ from the above readings. Using the computed $\lambda$ and the formula for the Poisson distribution, calculate the expected number $E[n]$ of intervals in which $n$ emissions are observed. Compare these with Rutherford's actual readings. Was Rutherford right?
VII.
John joins a coin-flip game at a casino, where players may bet on coin tosses. If they call the right face of the coin (heads or tails), they win as much money as they have bet. If they call it wrong, they lose their bet. John fancies himself a mathematical genius and comes up with the following strategy. At any time, if his net winnings at that point are 0 or greater, he will bet $1. If his net winnings so far are negative, he will bet as much money as his net loss. E.g., if after 10 games, he finds that he has lost a net of 3 dollars, he will bet 3 dollars.
John also plans to quit the game if he has either won 5 dollars or lost 3 dollars.
John has $10 in his pocket at the beginning of the game.
- What are his expected winnings?
- What is the expected number of games he will play before he quits?
VIII.
Gallup wants to conduct a poll to determine if the winning candidate in the next presidential election will be a Democrat or a Republican. So they poll a number of people to determine what fraction $f_{poll}$ of them would vote Republican. However, they would like to be sure that this fraction $f_{poll}$ is not too far from the actual fraction $f$ of the overall population who would vote Republican. In particular, they desire that there is a less than 5% probability that the estimated and true fractions differ by more than 5%, i.e. that $P(|f_{poll} - f| > 0.05) < 0.05$. What is the least number of people they must poll? Use the central limit theorem to obtain this number.
IX.
Your company has a magic classifier to analyze some market data that is correct 80% of the time. You invest all your spare time into improving it and finally come up with a classifier that you believe is better than the one being used by your company. You test 1000 instances and find that you classifier gets 850 of them right. But your boss is not convinced; he claims the difference could be the result of random variation in performance that would naturally see across different test sets.
- To convince your boss, you decide to test the $P$ value of your result, and show that that the result is significant at a $P=0.01$ level (i.e. that there is less than a probability of 0.01 of getting a result at least as good as yours randomly, with the company's default classifier). Work out the arithmetic to determine if your results of your test satisfy this criterion (i.e. if the probability of getting 850 or more right out of 1000 test instances is less than 0.01 using the company's default classifier).
- To further convince your boss, you decide to run a two-sided test. Your result classifies 50 more instances correctly than the expected number of successes for your company's default classifier on a test set of size 1000 (you would expect 800 correct classifications with your company's classifier). You want to show that the probability of getting a performance more than 50 instances away from the mean is also very small. Verify if the probability of obtaining a result at least as extreme as yours (e.g. the probability of getting 850 or more right, or 750 or fewer right) is less than 0.01.
You can treat the probability of correct classification as Bernoulli random variable to solve the above problems. In this case, the Bernoulli random variable representing your company's default classifier has $P(success) = 0.8$. You may also require the central limit theorem.
X.
To further convince your boss, you decide to run a paired test. You test your 1000 instances using both the company's default classifier and your own classifier. You objective is to compare the two classifications and show that yours is better on average.
You find that the default classifier and your new classifier differed in their predictions on 100 of the 1000 test instances. Your classifier performed better than the default classifier a net of 50 times (i.e. you got 75 instances right that the default classifier got wrong, and on 25 instances the default classifier was right, while yours was wrong).
Show that the probability of a result at least as skewed as this is less than 1%.
Remember that if your classifier only performed as well on average as the company's classifier (as your boss claims), you'd expect that on the instances on which your classifier and the company's classifier disagreed, the number of times you were right and the default classifier was wrong would be the same.