Probabilistic and On-Line Methods in Machine Learning
Thesis proposal by Adam Kalai
Date: Friday, April 14
Time: 1:00-2:00
Location: Wean 5409
Committee: Avrim Blum, Manuel Blum, Danny Sleator, and Santosh Vempala
Food will be served!
Abstract
The proposed thesis will include four problems in machine learning and
on-line algorithms. The talk will cover algorithms for two of these
problems, which are both "averaging" algorithms. First, we describe and
analyze k-fold cross-validation, a popular machine-learning technique for
estimating the error of a learned hypothesis. Viewed as an averaging
algorithm, we argue that it is at least as good an estimate as that
obtained from an independent held-out test set. Second, we look at a
particular investment strategy, Cover's Universal Portfolios, which has
on-line guarantees. Also viewed as an averaging algorithm, the analysis
becomes simple and the algorithm is easily extended to the case of
transaction costs. Interestingly, the analyses in the proposed thesis are
probabilistic, even though many of the algorithms are completely deterministic.