AI researchers often disagree about the best strategy for training a machine learning system, but one belief is widely shared: humans are still much better learners than machines. Unlike AI systems, humans do not learn difficult new tasks (e.g., solving differential equations) from scratch by looking at independent and identically distributed examples. Instead, humans often follow sequences of steps that allow them to incrementally build up the skills needed for these new tasks. Curriculum Learning (CL) is a line of work that tries to incorporate this human approach to learning into machine learning. In this thesis, we aim to discover the problem settings in which different forms of CL are beneficial, and the types of benefits they provide. We propose new CL methods and apply them to a variety of models and problem settings, from teaching an LSTM to solve basic arithmetic problems, to neural machine translation using Transformers, image classification using convolutional neural networks, and compositional multitask learning problems. Through these experiments, we observe that curriculum learning can be very beneficial in certain settings (e.g., on sequential data such as sentences) if well designed, but it can also harm the efficiency of learning if performed poorly (e.g., if the curriculum spends too much time on easy problems). Finally, we conduct analyses to understand why curriculum learning leads to the observed effects.
Tom Mitchell (Co-Chair)
Barnabás Póczos (Co-Chair)
Rich Caruana (Microsoft Research)