Optimization for machine learning


To minimize a function $F$ (or to maximize it, by minimizing $-F$), there are a few standard choices:

Gradient descent:
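Gradient descent repeatedly steps against the gradient, $x_{k+1} = x_k - \eta \nabla F(x_k)$, with step size $\eta > 0$. The following is a minimal sketch only: the names `gradient_descent` and `grad_F`, the fixed step size of 0.1, and the quadratic test function are all illustrative assumptions, not part of these notes.

```python
def gradient_descent(grad, x0, step=0.1, iters=100):
    """Minimize F by repeatedly stepping against its gradient (fixed step size)."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # x_{k+1} = x_k - step * grad F(x_k)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Assumed test function: F(x, y) = (x - 1)^2 + 2 * (y + 3)^2, minimized at (1, -3).
def grad_F(p):
    x, y = p
    return [2.0 * (x - 1.0), 4.0 * (y + 3.0)]


x_min = gradient_descent(grad_F, [0.0, 0.0])
print(x_min)
```

A fixed step only works when it is small enough relative to the curvature of $F$; in practice the step is often chosen by a line search or a decay schedule.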

Conjugate gradient:
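Conjugate gradient is easiest to show on a quadratic $F(x) = \tfrac{1}{2} x^T A x - b^T x$ with $A$ symmetric positive definite, where minimizing $F$ is the same as solving $Ax = b$. The sketch below is the linear version under that assumption; nonlinear variants replace the exact step with a line search. The helper names and the 2x2 example are hypothetical.

```python
def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, x0, tol=1e-12):
    """Minimize 0.5 * x^T A x - b^T x for symmetric positive definite A."""
    x = list(x0)
    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]  # residual = -gradient
    d = list(r)                                       # first direction: steepest descent
    rr = dot(r, r)
    for _ in range(len(b)):  # at most n steps in exact arithmetic
        if rr < tol:
            break
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)  # exact minimizer along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rr_new = dot(r, r)
        # New direction is A-conjugate to the previous ones.
        d = [ri + (rr_new / rr) * di for ri, di in zip(r, d)]
        rr = rr_new
    return x


# Assumed example: the minimizer is (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_cg = conjugate_gradient(A, b, [2.0, 1.0])
print(x_cg)
```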


Newton's method:
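Newton's method uses second-order information: in one dimension the update is $x_{k+1} = x_k - F'(x_k) / F''(x_k)$, and in several dimensions the division becomes a linear solve with the Hessian. The sketch below covers only the one-dimensional case, with an assumed test function $F(x) = e^x - 2x$ whose minimizer is $\ln 2$; the name `newton_minimize` is illustrative.

```python
import math


def newton_minimize(dF, d2F, x0, tol=1e-12, max_iter=50):
    """Find a stationary point of F in 1D from its first and second derivatives."""
    x = x0
    for _ in range(max_iter):
        step = dF(x) / d2F(x)  # Newton step: F'(x) / F''(x)
        x -= step
        if abs(step) < tol:
            break
    return x


# Assumed example: F(x) = exp(x) - 2x, so F'(x) = exp(x) - 2 and F''(x) = exp(x).
x_newton = newton_minimize(lambda x: math.exp(x) - 2.0,
                           lambda x: math.exp(x),
                           0.0)
print(x_newton, math.log(2.0))
```

Note that the iteration only seeks a point where $F' = 0$; it heads toward a minimum when $F'' > 0$ near the iterate, as it is for this example.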


Limited-memory BFGS:
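L-BFGS approximates the Newton direction using only the last $m$ pairs of position differences $s_k = x_{k+1} - x_k$ and gradient differences $y_k = \nabla F(x_{k+1}) - \nabla F(x_k)$, combined by the standard two-loop recursion. The sketch below is an illustrative assumption rather than a reference implementation: it uses a simple backtracking (Armijo) line search, whereas production codes typically use a Wolfe line search, and the function names and test problem are hypothetical.

```python
from collections import deque


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lbfgs(f, grad, x0, m=5, iters=100, tol=1e-12):
    """Minimize f with L-BFGS, keeping the m most recent (s, y) pairs."""
    x = list(x0)
    g = grad(x)
    history = deque(maxlen=m)  # oldest pair first, newest last
    for _ in range(iters):
        if dot(g, g) < tol:  # squared gradient norm
            break
        # Two-loop recursion: q ends up approximating H^{-1} g.
        q = list(g)
        alphas = []
        for s, y in reversed(history):  # newest to oldest
            a = dot(s, q) / dot(y, s)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if history:  # scale by an initial inverse-Hessian guess
            s, y = history[-1]
            gamma = dot(s, y) / dot(y, y)
            q = [gamma * qi for qi in q]
        for (s, y), a in zip(history, reversed(alphas)):  # oldest to newest
            b = dot(y, q) / dot(y, s)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        d = [-qi for qi in q]
        # Backtracking line search on the Armijo condition.
        t, fx, slope = 1.0, f(x), dot(g, d)
        while t > 1e-10 and f([xi + t * di for xi, di in zip(x, d)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        x_new = [xi + t * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:  # keep only pairs with positive curvature
            history.append((s, y))
        x, g = x_new, g_new
    return x


# Assumed test function: F(x, y) = (x - 1)^2 + 2 * (y + 3)^2, minimized at (1, -3).
def F(p):
    return (p[0] - 1.0) ** 2 + 2.0 * (p[1] + 3.0) ** 2


def grad_F(p):
    return [2.0 * (p[0] - 1.0), 4.0 * (p[1] + 3.0)]


x_lbfgs = lbfgs(F, grad_F, [0.0, 0.0])
print(x_lbfgs)
```

The memory parameter `m` trades storage for curvature information; each iteration costs $O(mn)$ for an $n$-dimensional problem instead of the $O(n^2)$ of full BFGS.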

Reading