next up previous
Next: Proof of Theorem 3 Up: Appendix A Previous: Appendix A

Proof of Theorem 1

Since the hardness results of Theorems 1 and 3 are stated for the two-classes case, we shall use the notation $\Delta_{(i)}=\vec{v}_{(i)}[1] - \vec{v}_{(i)}[0]$ for some arbitrary rule $(t_{(i)}, \vec{v}_{(i)})$, where $\vec{v}_{(i)}[0]$ is the value for class ``-'' and $\vec{v}_{(i)}[1]$ is the value for class ``+''. A positive value for $\Delta_{(i)}$ means that $t_{(i)}$ is in favor of class ``+'' whereas a negative value gives a $t_{(i)}$ in favor of class ``-''. Value $0$ for $\Delta_{(i)}$ gives a $t_{(i)}$ neutral with respect to the classes. We use a reduction from the $NP$-Hard problem ``Minimum Cover'' [Garey Johnson1979]:

The reduction is constructed as follows : from a ``Minimum Cover'' instance we build a learning sample $LS$ such that if there exists a cover of size $\vert C'\vert \leq K$ of $S$, then there exists a decision committee with $\vert C'\vert$ literals consistent with $LS$, and, reciprocally, if there exists a decision committee with $k$ literals consistent with $LS$, then there exists a cover of size $k$ of $S$. Hence, finding the smallest decision committee consistent with $LS$ is equivalent to finding the smallest $K$ for which there exists a solution to ``Minimum Cover'', and this is intractable if $P\neq NP$.
Let $c_j$ denote the $j^{th}$ element of $C$, and $s_j$ the $j^{th}$ element of $S$. We define a set of $\vert C\vert$ Boolean variables in one to one correspondence with the elements of $C$, which we use to describe the examples of $LS$. The corresponding set of literals is denoted $\{x_1, \overline{x}_1, x_2, \overline{x}_2, ..., x_{\vert C\vert},
\overline{x}_{\vert C\vert}\}$. The sample $LS$ contains two disjoint subsets : the set of positive examples $LS^+$, and the set of negative ones $LS^-$. $LS^+$ contains $\vert S\vert$ examples, denoted by $e^+_1,
e^+_2, ..., e^+_{\vert S\vert}$. We construct each positive example so that it encodes the membership of the corresponding element of $S$ in the elements of $C$. More precisely,
$\displaystyle \forall 1\leq i\leq \vert S\vert, e^+_i$ $\textstyle =$ $\displaystyle \left( \bigwedge_{j : s_i \in c_j} {x_j} \right) \wedge \left( \bigwedge_{j : s_i \not\in c_j} {\overline{x}_j} \right) \:\:.$ (9)

$LS^-$ contains a single negative example, defined by:
$\displaystyle e^-$ $\textstyle =$ $\displaystyle \bigwedge_{j=1}^{j=\vert C\vert} {\overline{x}_j} \:\:.$ (10)

$\bullet$ Suppose there exists a cover $C'$ of $S$ satisfying $\vert C'\vert \leq K$. We create a decision committee consisting of $K$ monomials, each with one literal only and associated to a positive $\Delta$. Each monomial codes one of the sets in $C'$. The default class is ``-''. This decision committee is consistent with the examples of $LS^+ \cup LS^-$, otherwise some element of $S$ would not be covered. If there are only two values authorized for the vectors and they are $\leq 0$, we simply create a DC consisting of one monomial with negative literals associated to a negative $\Delta$ (the value for the negative class is greater than the one of the positive class); each of the negative literals codes one of the sets in $C'$. The default class is ``+''.
$\bullet$ Suppose now that there exists a decision committee $f$ with at most $k$ literals consistent with $LS$. Denote $t_1, t_2, ..., t_{\vert f\vert}$ each monomial of $f$, in no specific order, and $\Delta_1, \Delta_2, ..., \Delta_{\vert f\vert}$ their associated values for $\Delta$. The monomials of $f$ can belong to three types of subsets of monomials: Let us call respectively $M$, $N$, $MN$ these three classes. Given that each monomial of $f$ can be associated to a positive or a negative $\Delta$, there exists on the whole six classes of rules, presented in Figure 5.

Figure 5: The six possible cases of rules.
\begin{figure}\begin{center}
\begin{tabular}{l\vert c\vert\vert c\vert\vert}
\m...
...$MN$\ \hspace{2cm} & $<0$\ \\
\cline{2-3}
\end{tabular}\end{center}\end{figure}

Any monomial of $f$ containing at least one positive literal can only be satisfied by positive examples. Therefore, if there exists rules belonging to class II or VI, we can remove them without losing consistency. Furthermore, since $e^-$ contains only negative literals, if we remove their negative literals from all rules belonging to class V (making them go to class I), we do not lose consistency. As a consequence, we can suppose without loss of generality that all rules of $f$ are in class I, III, or IV.
We now treat independently two cases, depending on whether the default class of $f$ is ``+'' or ``-''.

  1. The default class is ``-''. Any positive example satisfies therefore a monomial in $f$. There can exist two types of positive examples: those satisfying at least one rule of class I, and those not satisfying any class I rule (therefore satisfying at least one rule of class III). $e^-$ satisfies all class III and IV rules. Therefore,
    $\displaystyle \sum_{(t_i,\vec{v}_i) \in f \cap (\mbox{ class III } \cup \mbox{ class IV })} {\Delta_i} \leq 0 \:\:.$     (11)

    This shows that, if a positive example not satisfying any class I rules would satisfy all class IV rules, then it would be misclassified, which is impossible by the consistency hypothesis. This gives an important property, namely that any positive example not satisfying any class I rule cannot satisfy all class IV rules. Let us call P this property in what follows. We now show how to build a valid solution to ``Minimum Cover'' with at most $k$ elements. For any positive example $e^+_i$, Iterating the above procedure for all positive examples, we obtain a cover of $S$ consisting of at most $k$ subsets of $S$.
  2. The default class is ``+''. $e^-$ satisfies all class III and IV rules. Therefore,
    $\displaystyle \sum_{(t_i,\vec{v}_i) \in f \cap (\mbox{ class III } \cup \mbox{ class IV })} {\Delta_i} < 0 \:\:.$     (12)

    Even if the inequality is now strict, it gives the same procedure for efficiently building the solution to ``Minimum Cover'' with at most $k$ elements, by using the same argument as in the preceeding case.
This ends the proof of Theorem 1.


next up previous
Next: Proof of Theorem 3 Up: Appendix A Previous: Appendix A
2002 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.