􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊖􀊠􀊠􀊠􀊘􀊘􀊘􀊘􀊖􀊖
MIT 6.7220/15.084 — Nonlinear Optimization Thu, Feb 15th 2024
Lecture 4A
Feasibility, optimization, and separation
Instructor: Prof. Gabriele Farina ( gfarina@mit.edu)
These notes are class material that has not undergone formal peer review. The TAs and I are grateful for any
reports of typos.
In Lecture 1, we discussed how nonlinear optimization problems can be generally computationally
intractable. In Lecture 3, we introduced a class of functions—called convex functions—for which
first-order optimality conditions are both necessary and sufficient (assuming a convex feasible set
Ω). In this lecture, we continue our study of convex optimization, showing that in many cases the
solution to a convex optimization problem
min
𝑥
s.t.
𝑓(𝑥) convex
𝑥 ∈ Ω convex
can be found in polynomial time. However, it is also important to keep in mind that not all convex
optimization problems can be solved in polynomial time. For example, optimization over the copos-
itive cone (a convex set) is known to be intractable; we will talk more about that in a later class.
1 Separating a point from a closed convex set
An important property of any convex set Ω is that whenever a point 𝑦 is not in Ω, then we can sep-
arate 𝑦 from Ω using a hyperplane. In other words, flat separation surfaces are enough for certifying
that a point 𝑦 ∉ Ω.
Theorem 1.1. Let Ω ⊆ 𝑛 be a nonempty, closed, and convex set, and let 𝑦 ∈ 𝑛 be a point. If
𝑦 ∉ Ω, then there exist 𝑢 ∈ 𝑛, 𝑣 ∈ such that
⟨𝑢, 𝑦⟩ < 𝑣, and ⟨𝑢, 𝑥⟩ ≥ 𝑣 ∀𝑥 ∈ Ω.
Proof . The proof of the result rests on a very simple idea: the
direction of the halfspace will be made orthogonal to the line
that connects 𝑦 to its projection 𝑥∗ onto Ω, and the halfspace
boundary will be set so that it passes through 𝑥∗. We now
make the argument formal.
First, since Ω is nonempty and closed, a Euclidean projection
𝑥 of 𝑦 onto Ω exists,¹ as we discussed in Lecture 1. In other
words, the nonlinear optimization problem
min
𝑥
s.t.
1
2 ‖𝑥 − 𝑦‖2
2
𝑥 ∈ Ω
𝑢
𝑦
𝑥
Ω
must have at least a solution 𝑥 Ω. Furthermore, since the objective function is differentiable
and Ω is convex, from the first-order optimality conditions (see Lecture 2) we know that
⟨𝑥∗ − 𝑦, 𝑥 − 𝑥∗⟩ ≥ 0 ∀𝑥 ∈ Ω. (1)
Let now
𝑢 ≔ 𝑥∗ − 𝑦, [▷ this is the direction that connects 𝑦 to 𝑥∗]
and 𝑣 ≔ ⟨𝑢, 𝑥∗⟩. [▷ so that the halfspace boundary passes through 𝑥∗]
Note that 𝑢 ≠ 0, since 𝑥 Ω but 𝑦 ∉ Ω. So, ‖𝑢‖ > 0 and therefore
⟨𝑢, 𝑦⟩ = ⟨𝑢, 𝑥∗ − 𝑢⟩ = 𝑣 − ‖𝑢‖2
2 < 𝑣.
Thus, to complete the proof, we now need to show that ⟨𝑢, 𝑥⟩ ≥ 𝑣 for all 𝑥 ∈ Ω. But this is exactly
what (1) guarantees, since 𝑢 = 𝑥 𝑦 and 𝑣 = ⟨𝑢, 𝑥∗⟩ by definition.
¹In fact, it is easy to prove that the projection is unique (see Homework 1, Exercise 4). However, we do not need
uniqueness for the argument that follows.
The result above might not seem like much. After all, the proof is pretty straightforward, and the
geometric intuition strong enough that one might be tempted to just take it for granted. Instead, the
consequences of the result are deep, far-reaching, and intimately tied to some of the most significant
breakthroughs in mathematical optimization theory.
1.1 Separation oracles
The result established in Theorem 1.1 justifies the following definition.
Definition 1.1 ((Strong) separation oracle). Let Ω ⊆ 𝑛 be convex and compact. A strong sep-
aration oracle for Ω is an algorithm that, given any point 𝑦 ∈ 𝑛, correctly outputs one of the
following:
• “𝑦 ∈ Ω”, or
• “(𝑦 ∉ Ω, 𝑢, 𝑣)”, where the pair (𝑢, 𝑣) ∈ 𝑛 × is such that
⟨𝑢, 𝑦⟩ < 𝑣, and ⟨𝑢, 𝑥⟩ ≥ 𝑣 ∀𝑥 ∈ Ω.
1.2 Finding separating hyperplanes in practice
Theorem 1.1 guarantees the existence of a separating hyperplane. In many problems of interest,
constructing a separation oracle is simple.
Example 1.1 (Separation oracle for a convex polytope). Let Ω be a convex polytope, that is,
the convex set defined by the intersection of a finite number of halfspaces (linear inequalities)
Ω ≔ {𝑥 ∈ ℝ𝑛 : 𝐴𝑥 ≤ 𝑏}, where 𝐴 =

⎛— 𝑎
1

— 𝑎
𝑚 —⎠
⎞ ∈ ℝ𝑚×𝑛, 𝑏 ∈ ℝ𝑚.
Then, given a point 𝑦 ∈ 𝑚, we can implement a separation oracle as follows:
• if 𝐴𝑦 ≤ 𝑏, return “𝑦 ∈ Ω”;
• else, at least one of the inequalities 𝑎
𝑗 𝑦 ≤ 𝑏𝑗, 𝑗 ∈ {1, ..., 𝑚} is violated. In other words,
there exists 𝑗 such that 𝑎
𝑗 𝑦 > 𝑏𝑗, while by definition of Ω, 𝑎
𝑗 𝑥 ≤ 𝑏𝑗 for all 𝑥. This shows
that the response “(𝑦 ∉ Ω, −𝑎𝑗, −𝑏𝑗)” is a valid response.
Remark 1.1. Example 1.1 shows that whenever we have a finite number 𝑚 of inequalities, a
separation oracle for the polytope defined by those inequalities can be implemented in time
that depends linearly on 𝑚 and the dimension of the embedding space. This result establishes
a blanket guarantee, but in some cases, one can do better: depending on the structure of the
inequalities, sometimes one can get away with sublinear complexity in 𝑚. In some cases, one
might be able to construct an efficient separation oracle even for polytopes that have an infinite
number of inequalities!
We proceed with another classic example of a feasible set that admits a simple separation oracle.
Example 1.2 (Separation oracle for the semidefinite cone). Let Ω = {𝑀 ∈ 𝑛×𝑛 : 𝑀 ≽ 0} be the
set of semidefinite matrices, that is, all symmetric matrices such that 𝑣𝑀 𝑣 ≥ 0 for all 𝑣 ∈ 𝑛
—or, equivalently, such that all of 𝑀 ’s eigenvalues are nonnegative.
Then, given a point 𝑌 ∈ 𝑛×𝑛, we can implement a separation oracle as follows:
• if 𝑌 is not symmetric, then there exist 𝑖, 𝑗 ∈ {1, ..., 𝑛} such that 𝑌𝑖𝑗 < 𝑌𝑗𝑖; return “(𝑌 ∉
Ω, 𝐸𝑖𝑗 − 𝐸𝑗𝑖, 0)”, where 𝐸𝑖𝑗 is the matrix of all zeros, except in position 𝑖, 𝑗 where it has
a 1.
• else, if 𝑌 is symmetric, we can compute all of its eigenvalues and eigenvectors. If one
eigenvalue is negative, then the corresponding eigenvector 𝑤 must be such that 𝑤𝑌 𝑤 =
⟨𝑌 , 𝑤𝑤⊤⟩ < 0. Hence, return “(𝑌 ∉ Ω, 𝑤𝑤⊤, 0)”.
• otherwise, return “𝑌 ∈ Ω”.
As we show next, a fundamental result in optimization theory reveals that under mild hypotheses,
if the feasible set admits an efficient separation oracle and the objective function is convex, then the
solution can be computed efficiently.
2 Optimization via separation
In a major breakthrough in mathematical optimization, Khachiyan, L. G. [Kha80] proposed a poly-
nomial-time algorithm for using separation oracles to find the minimum of a linear function. The
algorithm, which goes under the name of ellipsoid method is actually more general, and applies to
general convex objectives on sets for which separation oracles are available. The result builds on top
of previous work by Šor, N. Z. [Šor77] and Yudin, D. B., & Nemirovskii, A. S. [YN76].
In particular, Khachiyan’s result was the first to show that linear programming problems can be
solved in polynomial time. This was an unexpected result at the time, and in fact, the complexity of
linear programming solvers was conjectured to be not polynomial (more on this in the next section).
The result of Khachiyan stirred so much enthusiasm in the research community that the New York
Times even advertised it on its first page.
Despite the enthusiasm, the ellipsoid method turned out to be very impractical. Still, it is a great
theoretical idea, and its consequences are pervasive.
Figure 1: https://timesmachine.nytimes.com/timesmachine/1979/11/07/issue.html
2.1 The intuition behind the ellipsoid method
Formalizing the details of the ellipsoid method is rather complex. A major source of difficulty is
the fact that the algorithm needs to approximate square roots using fractions to be implementable
on a finite-precision machine, and that causes all sorts of tricky analyses that the approximation
error can indeed be kept under control. These details are certainly important, but are notoriously
tedious, and fundamentally they are just that, details. If you are curious to read a formal account,
I recommend the authoritative book by Grötschel, M., Lovász, L., & Schrijver, A. [GLS93]. For this
lecture, we just focus on the idea behind the ellipsoid method.
The idea behind the ellipsoid method is rather elegant. At its core, it is a generalization of binary
search from one dimension to multiple dimensions. At every iteration of the algorithm, the space is
“cut” by using a separating hyperplane.
Feasibility. To build intuition, ignore for now the objective function, and consider the following
problem: given a separation oracle for Ω (closed and convex), either find 𝑥 ∈ Ω, or determine that
Ω is empty. You are given two radiuses:
• the radius 𝑅 > 0 guarantees that if Ω is not empty, then Ω ∩ 𝔹𝑅(0) ≠ ∅;
• the radius 𝑟 > 0 guarantees that if Ω is not empty, then it contains a ball of radius 𝑟 in its
interior.
If this problem were one-dimensional, then Ω would be either empty or an interval, and a separation
oracle would be an algorithm that, given any 𝑦 ∈ ℝ, would return whether 𝑦 ∈ Ω, or one of the
statements “𝑦 is too small” / “𝑦 is too large”. Solving the problem now appears easy: start from the
interval [−𝑅, 𝑅], and perform a binary search using the separation oracle to guide the search. Once
the size of the search interval drops below 𝑟, we know that Ω is empty.
The ellipsoid method generalizes this idea to multiple dimensions. At every iteration, it keeps track
of a “search space” (the generalization of the search interval above). Then, it queries the separation
oracle for the center of this search space. If the point does not belong to Ω, and the separation oracle
returns the separating hyperplane ⟨𝑢, 𝑥⟩ = 𝑣, then the search space is cut by considering now only
the subset of the search space that intersects {𝑥 ∈ 𝑛 : ⟨𝑢, 𝑥⟩ ≥ 𝑣}. The process continues until the
volume of the search space becomes smaller than the radius 𝑟. The reason why this method is called
the “ellipsoid method” is that the search space in the multi-dimensional case is not kept in the form
of an interval, but rather as an ellipsoid. This is mostly for computational reasons, since we need to
have an internal way of representing the search domain that is convenient to use.
Incorporating the objective. The above idea can be extended to incorporate an objective function
𝑓(𝑥). To do that, we will need to start cutting not only the search ellipsoid, but also the feasible set
to make sure we end up at the optimum. In other words, you can think of this extended ellipsoid
method as having “two modes”: while it has not found a feasible point in Ω, it cuts the search
ellipsoid; then, once feasible points are found, it cuts the feasible set to exclude all values above the
current value.
• Initialize at time 𝑡 = 1 with the starting point 𝑦1 0 ∈ ℝ𝑛, starting ellipsoid 1 𝔹𝑅(0), and
starting feasible set Ω1 Ω.
• At each time 𝑡, we ask a separation oracle for Ω𝑡 whether the center 𝑐𝑡 𝑛 of the search
ellipsoid 𝑡 belongs to Ω𝑡 or not.² There are only two cases:
If the center 𝑐𝑡 is not feasible, then set Ω𝑡+1 ≔ Ω𝑡, and cut the search space by setting
𝑡+1 to an ellipsoid that contains the intersection between 𝑡 and the halfspace containing
Ω𝑡 returned by the separation oracle.
If the center 𝑐𝑡 is feasible, then we know for sure that all points 𝑥 ∈ Ω𝑡 such that
⟨∇𝑓(𝑐𝑡), 𝑥 − 𝑐𝑡⟩ ≥ 0 are such that 𝑓(𝑥) ≥ 𝑓(𝑐𝑡). This follows trivially from the linear lower
bound property of convex functions (Theorem 1.1 of Lecture 3):
⟨∇𝑓(𝑐𝑡), 𝑥 − 𝑐𝑡⟩ ≥ 0 𝑓(𝑥) ≥ 𝑓(𝑐𝑡) + ⟨∇𝑓(𝑐𝑡), 𝑥 − 𝑐𝑡⟩ ≥ 𝑓(𝑐𝑡).
Hence, we can cut both the search ellipsoid 𝑡 and the feasible set Ω𝑡 by considering their
intersection with the halfspace 𝐻𝑡 {𝑥 ∈ 𝑛 : ⟨∇𝑓(𝑐𝑡), 𝑥 − 𝑐𝑡⟩ ≤ 0}. In particular, we set
Ω𝑡+1 ≔ Ω𝑡 ∩ 𝐻𝑡, and set 𝑡+1 to a smaller ellipsoid that contains 𝑡 𝐻𝑡.
Finally, after the volume of the search ellipsoid has gotten sufficiently small (this happens
after 𝑇 = 𝑂(𝑛2) log(𝑅/𝑟) iterations), we output the following:
– If we never encountered a center 𝑐𝑡 that was feasible, then we report that Ω was
infeasible.
– Else, we output the 𝑐𝑡 that minimizes 𝑓, out of those that were feasible.
²There is a caveat here: technically, we are assuming as given a separation oracle for Ω, not Ω𝑡. Yet, because Ω𝑡 is
obtained from Ω by intersecting with halfspaces, it is easy to see that one separation oracle for Ω𝑡 can be constructed
efficiently starting from that for Ω and the description of the intersected hyperplanes. Try working out the details!
Assuming that we can ignore all sorts of tedious rounding issues, the following guarantee can be
shown [Gup20].
Theorem 2.1. Let 𝑅 and 𝑟 be as above, and let the range of the function 𝑓 on Ω be bounded
by [−𝐵, 𝐵]. Then, the ellipsoid method described above run for 𝑇 ≥ 2𝑛2 log(𝑅/𝑟) steps either
correctly reports that Ω = , or produces a point 𝑥∗ such that
𝑓(𝑥∗) ≤ 𝑓 (𝑥) +
2𝐵𝑅
𝑟 exp(− 𝑇
2𝑛(𝑛 + 1)) ∀𝑥 ∈ Ω.
2.2 Takeaway message: Separation implies optimization
If you squint your eyes, what the ellipsoid method proves constructively is the following: if we know
how to construct a separation oracle for a set Ω, then we can optimize over Ω. Of course, this is
a bit of a simplification (and there are all sorts of little conditions here and there as we have seen
above), but nonetheless it is a good first approximation of the general message.
The opposite direction is also known to be true, even when by “optimization” we simply mean
optimization of linear objective functions.
Further readings and bibliography
If you want to read more about the ellipsoid method, the book by Grötschel, M., Lovász, L., &
Schrijver, A. [GLS93] is a standard and accessible reference on the topic. The bound on the approx-
imation error incurred by the ellipsoid method was taken from Gupta, A. [Gup20].
[Kha80] L. G. Khachiyan, “Polynomial algorithms in linear programming,” USSR Computational
Mathematics and Mathematical Physics, vol. 20, no. 1, pp. 53–72, 1980.
[Šor77] N. Z. Šor, “Cut-off method with space extension in convex programming problems,” Cy-
bernetics, vol. 13, no. 1, pp. 94–96, 1977.
[YN76] D. B. Yudin and A. S. Nemirovskii, “Informational complexity and efficient methods for
the solution of convex extremal problems,” Matekon, vol. 13, no. 2, pp. 22–45, 1976.
[GLS93] M. Grötschel, L. Lovász, and A. Schrijver, Geometric Algorithms and Combinatorial Op-
timization. Berlin, Germany: Springer, 1993. [Online]. Available: https://link.springer.
com/book/10.1007/978-3-642-78240-4
[Gup20] A. Gupta, “The Centroid and Ellipsoid Algorithms.” [Online]. Available: https://www.cs.
cmu.edu/~15850/notes/lec21.pdf

Content

PDF File

Typo or question?

Get in touch!
gfarina AT mit.edu

Metadata

Course: MIT 6.7220 / 15.084
Term: Spring 2024
Date: 2024-02-15