Problems encountered in real-world applications are often NP-hard, so a wealth of algorithms has been developed to provide approximate solutions efficiently. These algorithms are often characterized by real-valued parameters, and it is crucial to decide which algorithm from a parameterized family of infinitely many candidates is the “best” to use. Conventionally, one would choose the algorithm with the best worst-case performance, but the worst-case scenario may never materialize in a specific application domain. This calls for an application-specific algorithm selection approach, in which we tune the algorithm’s parameters to suit an unknown distribution over problem instances that defines which problems are typical in that application. While this question has been studied empirically over the past several decades, there had been little work toward a firm theoretical understanding until Gupta and Roughgarden [2016] proposed a learning-theoretic model in which one “learns” the best algorithm for an application by sampling problem instances from the underlying distribution. We work in this framework and focus on two widely used classes of algorithms: (i) an infinite family of clustering algorithms that combine an agglomerative step with dynamic programming, and (ii) a rich class of rounding techniques for SDP relaxations of integer quadratic programs such as max-cut. We provide tight bounds on the learning-theoretic dimensions of these classes and, based on these bounds, present simple and efficient techniques with sample-complexity guarantees for learning the best algorithm in a given application domain. Based on joint work with Nina Balcan, Ellen Vitercik, and Colin White.
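
To make the setup concrete, here is a minimal sketch of learning a parameterized rounding scheme for max-cut from sampled instances. It uses margin-based “s-linear” rounding of unit vectors, a standard parameterized variant of Goemans–Williamson hyperplane rounding; the abstract does not specify the exact rounding class, so treat this as illustrative. The function names (`s_linear_round`, `learn_s`) and the random unit-vector “embeddings” standing in for actual SDP solutions are all assumptions for the sake of a runnable toy example.

```python
import numpy as np

def s_linear_round(V, s, rng):
    """Round unit vectors (rows of V) to +/-1 labels.

    Project each vector onto a random direction r. Outside the margin
    [-s, s] the sign is taken deterministically; inside it, the label
    is +1 with probability (1 + <v_i, r>/s) / 2.
    """
    r = rng.standard_normal(V.shape[1])
    r /= np.linalg.norm(r)
    proj = V @ r
    labels = np.where(proj >= 0, 1.0, -1.0)
    inside = np.abs(proj) < s
    p_plus = (1.0 + proj[inside] / s) / 2.0
    labels[inside] = np.where(rng.random(inside.sum()) < p_plus, 1.0, -1.0)
    return labels

def cut_value(W, labels):
    # Total weight of edges crossing the cut; W is a symmetric weight matrix.
    return 0.25 * np.sum(W * (1.0 - np.outer(labels, labels)))

def learn_s(instances, s_grid, rounds=20, seed=0):
    """Empirical risk minimization over a grid of parameter values:
    return the s with the highest average rounded cut value over the
    sampled instances."""
    rng = np.random.default_rng(seed)
    avg = []
    for s in s_grid:
        total = 0.0
        for W, V in instances:
            total += np.mean([cut_value(W, s_linear_round(V, s, rng))
                              for _ in range(rounds)])
        avg.append(total / len(instances))
    return s_grid[int(np.argmax(avg))]

# Toy demo: random weighted graphs, with random unit vectors standing in
# for SDP solutions (a real pipeline would solve the SDP relaxation here).
rng = np.random.default_rng(1)
instances = []
for _ in range(5):
    n = 12
    W = rng.random((n, n)); W = np.triu(W, 1); W = W + W.T
    V = rng.standard_normal((n, 4))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    instances.append((W, V))

best_s = learn_s(instances, s_grid=[0.25, 0.5, 1.0, 2.0])
print("learned s:", best_s)
```

In the actual framework, the sample-complexity guarantees bound how many instances must be drawn so that the empirically best parameter is near-optimal for the whole distribution; the grid search above is only the simplest possible learner.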