Proofsketch of Theorem 5

Next: Proof of Lemma 1 Up: Appendix A Previous: Proof of Theorem 4

Proofsketch of Theorem 5

The reduction is made from the -Hard problem 3SAT5 [Feige1996]. This is the classical 3SAT problem [Garey Johnson1979], but each variable appears in exactly 5 clauses. Using a well-known reduction [Garey Johnson1979], page 55, with an additional simple gadget, we can make a reduction from 3SAT5 to vertex cover (thus, independent set), obtaining a graph in which all vertices have degree either 5, or 0, and for which the largest independent set (for satisfiable instances of 3SAT5) has size $\vert V\vert/2$ , where $\vert V\vert$ is the number of vertices of . From this particular graph, we build a simple reduction to our problem of maximizing . Note that since we are searching for an oblivious hypothesis, the observations are not important (we can suppose that all examples have the same observation). That is why the reduction only builds class vectors (over $\vert V\vert$ classes), encoding the class membership of any of these identical observations. The idea is that the classes are in one-to-one mapping with the vertices, and there are two sets of class vectors built from :

a set with $\vert V\vert$ vectors, encoding the vertices of . Each one is a class vector with only one ``1'' corresponding to the vertex, and the remaining components are zeroes. Each of the corresponding examples have weight .
a set with $\vert E\vert$ vectors, where $\vert E\vert$ is the number of edges of . Each one encodes an edge, and therefore contains two ``1'' (and the remaining are zeroes) corresponding to the two vertices of the edge. Each of the corresponding examples have weight .

Consider formulas (2), (3) for example. They are the sum of the contribution to

of the examples having weight

, and the examples having weight

. In these cases, we can rewrite

using the generic expression:

$\displaystyle Z$	$\textstyle =$	$\displaystyle Z_v + Z_e \:\:,$	(23)
$\displaystyle Z_v$	$\textstyle =$	$\displaystyle W_v \left( e^{-\alpha} k (\vert V\vert-k) + k(k-1) + (\vert V\vert-k)(\vert V\vert-k-1) + e^{\alpha} k (\vert V\vert-k) \right) \:\:,$	(24)
$\displaystyle Z_e$	$\textstyle =$	$\displaystyle W_e \left( e^{-\alpha} (\vert V\vert-k) (2C+U) + e^{\alpha} k (2M+U) \right) \:\:.$	(25)

Here,

is the number of edges having their two vertices in the set corresponding to the

values in $\vec{v}_i$ ,

is the number of edges having their two vertices in the set corresponding to the

values in $\vec{v}_i$ , and

is the number of edges having one of their vertices in the

set, and the other one in the

set.

is the number of

values in $\vec{v}_i$ .
Suppose that $W_v\gg W_e$ (e.g. $W_v > \vert V\vert^3 W_e$ ). Then the maximization of

is the maximization of

, followed by the maximization of

admits a maximum for $k=\vert V\vert/2$ , and with this value for

, it can be shown that maximizing

boils down to maximize

, that is, the (weighted) number of edges not falling entirely into the set corresponding to the

values; whenever the 3SAT5 instance is satisfiable (and using the particular degrees of the vertices), this set corresponds to the largest independent set of

Next: Proof of Lemma 1 Up: Appendix A Previous: Proof of Theorem 4