\documentclass{report}
\usepackage{amsmath, amsthm, amssymb}
\usepackage{setspace, fancyhdr, extramarks, fullpage, lastpage}
\usepackage{fourier}
\usepackage{graphicx,float,wrapfig, algorithm2e}
\usepackage[all]{xy}
\usepackage{enumerate}
\usepackage{url}
\usepackage{proof}
% Homework Specific Information
\newcommand{\hmwkTitle}{Homework 2}
\newcommand{\hmwkAssignDate}{February 1}
\newcommand{\hmwkDueDate}{February 15}
\newcommand{\hmwkClass}{Graduate Artificial Intelligence 15-780}
\newcommand{\hmwkClassInstructor}{Geoff Gordon \and Tuomas Sandholm}
\newcommand{\hmwkDescr}{Tree Search and SAT solving}
%\newcommand{\Answer}[1]{\noindent\fbox{\begin{minipage}[c]{0.9\columnwidth}#1\end{minipage}}}
\newcommand{\Answer}[1]
{}
\newenvironment{penumerate}{
\begin{enumerate}
\setlength{\itemsep}{0pt}
\setlength{\parsep}{3pt}
\setlength{\topsep}{-8pt}
\setlength{\partopsep}{0pt}
\setlength{\leftmargin}{1.5em}
\setlength{\labelwidth}{1em}
\setlength{\labelsep}{0.5em}
\setlength{\parskip}{0pt}
\setlength{\parsep}{0pt}
}{\end{enumerate}}
\newenvironment{pitemize}{
\begin{itemize}
\setlength{\itemsep}{0pt}
\setlength{\parsep}{3pt}
\setlength{\topsep}{-8pt}
\setlength{\partopsep}{0pt}
\setlength{\leftmargin}{1.5em}
\setlength{\labelwidth}{1em}
\setlength{\labelsep}{0.5em}
\setlength{\parskip}{0pt}
\setlength{\parsep}{0pt}
}{\end{itemize}}
\renewcommand{\labelenumi}{\alph{enumi})}
% In case you need to adjust margins:
\topmargin=-0.45in %
\evensidemargin=0in %
\oddsidemargin=0in %
\textwidth=6.5in %
\textheight=9.0in %
\headsep=0.25in %
% Setup the header and footer
\pagestyle{fancy} % %
\lhead{\hmwkTitle} %
\rhead{\hmwkClass } % %
\cfoot{} %
\rfoot{Page\ \thepage\ of\ \protect\pageref{LastPage}} %
\renewcommand\headrulewidth{0.4pt} %
\renewcommand\footrulewidth{0.4pt} %
% This is used to trace down (pin point) problems
% in latexing a document:
%\tracingall
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Some tools
\newcommand{\enterProblemHeader}[1]{\nobreak\extramarks{#1}{#1 continued on next page\ldots}\nobreak%
\nobreak\extramarks{#1 (continued)}{#1 continued on next page\ldots}\nobreak}%
\newcommand{\exitProblemHeader}[1]{\nobreak\extramarks{#1 (continued)}{#1 continued on next page\ldots}\nobreak%
\nobreak\extramarks{#1}{}\nobreak}%
\newlength{\labelLength}
\newcommand{\labelAnswer}[2]
{\settowidth{\labelLength}{#1}%
\addtolength{\labelLength}{0.25in}%
\changetext{}{-\labelLength}{}{}{}%
\noindent\fbox{\begin{minipage}[c]{\columnwidth}#2\end{minipage}}%
\marginpar{\fbox{#1}}%
% We put the blank space above in order to make sure this
% \marginpar gets correctly placed.
\changetext{}{+\labelLength}{}{}{}}%
\newcommand{\homeworkProblemName}{}%
\newcounter{homeworkProblemCounter}%
\newenvironment{homeworkProblem}[1][Problem \arabic{homeworkProblemCounter}]%
{\stepcounter{homeworkProblemCounter}%
\renewcommand{\homeworkProblemName}{#1}%
\section*{\homeworkProblemName}%
\enterProblemHeader{\homeworkProblemName}}%
{\exitProblemHeader{\homeworkProblemName}}%
\newcommand{\problemLAnswer}[1]
{\labelAnswer{\homeworkProblemName}{#1}}
\newcommand{\homeworkSectionName}{}%
\newlength{\homeworkSectionLabelLength}{}%
\newenvironment{homeworkSection}[1]%
{% We put this space here to make sure we're not connected to the above.
% Otherwise the changetext can do funny things to the other margin
\renewcommand{\homeworkSectionName}{#1}%
\settowidth{\homeworkSectionLabelLength}{\homeworkSectionName}%
\addtolength{\homeworkSectionLabelLength}{0.25in}%
\changetext{}{-\homeworkSectionLabelLength}{}{}{}%
\subsection*{\homeworkSectionName}%
\enterProblemHeader{\homeworkProblemName\ [\homeworkSectionName]}}%
{\enterProblemHeader{\homeworkProblemName}%
% We put the blank space above in order to make sure this margin
% change doesn't happen too soon (otherwise \sectionAnswer's can
% get ugly about their \marginpar placement.
\changetext{}{+\homeworkSectionLabelLength}{}{}{}}%
\newcommand{\sectionAnswer}[1]
{% We put this space here to make sure we're disconnected from the previous
% passage
\noindent\fbox{\begin{minipage}[c]{\columnwidth}#1\end{minipage}}%
\enterProblemHeader{\homeworkProblemName}\exitProblemHeader{\homeworkProblemName}%
\marginpar{\fbox{\homeworkSectionName}}%
% We put the blank space above in order to make sure this
% \marginpar gets correctly placed.
}%
\begin{document}
\begin{spacing}{1.1}
\begin{titlepage}
\begin{center}
~
\vspace{2in}
\Large{\textmd{\textbf{\hmwkClass}}}
\vspace{1in}
\Huge{\hmwkTitle: \emph{\hmwkDescr}}
\vspace{1in}
\decofourright
\vspace{1in}
\large{Out\ on\ \hmwkAssignDate}\\
\large{Due\ on\ \hmwkDueDate}
\end{center}
\end{titlepage}
% ---- Problem 1 -----
\begin{homeworkProblem}[Problem 1: Proving with InstGen]\label{prob:instgen}
Consider the following formula:
\begin{equation}
\mathcal{F} = (\forall x, P(x) \vee Q(x)) \wedge (\neg P(a) \vee \neg P(b))
\end{equation}
Prove that $\mathbf{\mathcal{F}} \Rightarrow \exists x, Q(x)$ using the InstGen algorithm given in class (pp. 38-47 in the slides for January $18^\text{th}$). For each iteration $\{0,\ldots,n\}$, please provide:
\begin{pitemize}
\item The first-order formula $\mathcal{G}_i$;
\item The propositionalized formula $\mathcal{G}_i^{(Gr)}$;
\item A propositional model $M^{(Gr)}$ for $\mathcal{G}_i^{(Gr)}$;
\item The lifted model $M$ for $\mathcal{G}_i$;
\item The set of discordant pairs;
\item An application of InstGen on one of those discordant pairs that leads to $\mathcal{G}_{i+1}$
\end{pitemize}
Discuss how you formed $\mathcal{G}_0$, and why you are using it to prove $\mathcal{F} \Rightarrow \exists x, Q(x)$. For the final iteration $n$ report which of the algorithm's termination conditions you hit.
We found a proof that took 4 applications of InstGen, but your proof may be longer. The propositional formulas $\mathcal{G}_i^{(Gr)}$ should be simple enough to find models by hand.
%\vspace{10pt}
%Recall that InstGen is the following proof rule:
%\begin{equation}
%\infer{(A \vee \ell)\theta \qquad (\neg \ell' \vee B)\theta}{A \vee \ell' \qquad \neg \ell \vee B}
%\end{equation}
%Where $\theta$ is a variable substitution such that $\ell\theta = \ell'\theta$, $A$ and $B$ are the remainder of their clauses (could either be empty or some disjunction of literals) and $(\ell,\ell')$ are a discordant pair. This notation means that given the clauses on top of the line, you can conclude the clauses on the bottom.
\end{homeworkProblem}
% ---- Problem 2 -----
\begin{homeworkProblem}[Problem 2: Build-A-SAT-Solver Workshop]\label{prob:solver}
In this problem you need to build a boolean satisfiability solver that takes a set of variables and constraints in conjunctive normal form (CNF) and returns either a satisfying assignment or determines that no satisfying assignment is possible.
In particular, we are asking you to implement the DPLL algorithm---it is the basis for some of the world's fastest SAT solvers. Even though SAT is an $NP$-complete problem DPLL typically has good \emph{empirical} performance.
The input will be provided in a simplified DIMACS CNF format (see, for example, \\\url{http://www.satcompetition.org/2009/format-benchmarks2009.html}):
\begin{pitemize}
\item Each line that starts with a `\texttt{c}' is a comment;
\item The first non-comment line must be of the form: `$\mathtt{p\; cnf\; \left\; \left}$;
\item All other lines are space delimited lists of literals with a positive value indicating the variable and a negative value indicating its negation. For example, the number `\texttt{-4}' represents $\neg x_4$. Each line is terminated by a `$0$'. (Note: because of this don't call any variable $x_0$. Not only is this symbol reserved, but also $0 = -0$.)
For example, the formula $(x_1 \vee \neg x_2) \wedge (\neg x_1 \vee x_2 \vee \neg x_3)$ can be represented as:
\begin{verbatim}
c Some made-up problem
p cnf 3 2
1 -2 0
-1 2 -3 0
\end{verbatim}
\end{pitemize}
The solution should be a space delimited two column file with a row per variable. Each row should have the format $v a$, where $v$ is the variable number and $a$ is either $0$ or $1$ (false or true). For example, a solution file for the above problem might look like:
\vspace{10pt}
\hspace{10pt}\begin{minipage}{0.2\columnwidth}
\begin{verbatim}
1 0
2 0
3 0
\end{verbatim}
\end{minipage}
\vspace{10pt}
In this solution, each variable was set to false.
Some test SAT problems can be found on the homework page. You can use the provided Python model verifier to check your solutions. Feel free to compare against minisat (\url{http://minisat.se/}) or any other existing solver. Just make sure that you hand in your own code.
In your implementation of DPLL, consider including the following features:
\begin{pitemize}
\item Unit propagation;
\item Variable ordering heuristics;
\item Value ordering heuristics;
\item Handling special cases (\emph{e.g.} Horn clauses)
\end{pitemize}
Please do not include clause learning or or conflict-directed backjumping at this stage of your code. The solver that you build here will be extended in the next problem set. Design your SAT solver with this in mind---make sure that your implementation is robust and easily extensible.
We will not specify what language you should use, but you \textbf{must} provide two scripts to ease automated grading:
\begin{pitemize}
\item \textbf{build.sh}, a script that builds your project. For C++ this may be a bunch of \texttt{g++} calls, or a single make call. For a scripting language like Python this might be an empty file or a pleasant \texttt{echo} message for the grader.
\item \textbf{run.sh}, a script that takes in a single DIMACS CNF formatted file as input and prints either a solution or "UNSAT".
\end{pitemize}
In particular, your scripts should hook into a grading script that looks like:
\begin{verbatim}
${DIR}/build.sh
${DIR}/run.sh file1.cnf > ${DIR}/run_1.out
${DIR}/run.sh file2.cnf > ${DIR}/run_2.out
...
\end{verbatim}
\noindent{\large\textbf{How you will be graded:}}
You must provide the following files:
\begin{pitemize}
\item The commented code for your solver;
\item Both \textbf{build.sh} and \textbf{run.sh}.
\end{pitemize}
Do not include any binaries.
If your scripts do not conform to our naming standards, your code does not compile, or your code is a wrapper to an external SAT solver, \textbf{the grader will be unhappy and mark accordingly}. If the grader is in a good mood, they might provide limited `run-time' debugging, but \textbf{do not count on it}. Well commented code is always appreciated and is known to put graders in a good mood.
Your code will be first run against the provided test examples, then run against a reserved set of CNF problems. You will get full marks if your solver provides either a satisfying assignment or correctly identifies a formula as UNSAT within \textbf{two minutes}. (This is a generous cut-off and is per run.) Bonus marks will be awarded to the top three fastest solvers for each problem instance in the reserved set.
Code hand-in instructions will be sent out soon via email.
\end{homeworkProblem}
% ---- Problem 3 -----
\begin{homeworkProblem}[Problem 3: Getting Empirical On Your SAT]\label{prob:phase}
Write a random $3SAT$ generator. This generator should pass in two numbers: $c$ and $n$. Given these numbers the generator should generate $c$ clauses and write the result in the DIMACS format described in Problem 2. Each clause should have three literals selected uniformly from all possible literals of $n$ variables. (recall that there are $2n$ literals if there are $n$ variables).
Generate $25$ different random $3SAT$ instances for $n=25$ and $c = \left\{75, 85, \ldots, 175\right\}$. Generate a further $25$ different random $3SAT$ instances for $n=50$ and $c = \left\{150, 170, \ldots, 350\right\}$. This is a total of $550$ problem instances---start these experiments early. Run your SAT solver from Problem 2 on each instance, and record the time that it takes to solve it and whether it was satisfiable or not.
\begin{enumerate}
\item Plot the probability of $SAT$ vs. the ratio $\frac{c}{n}$ for both $n = 25$ and $n = 50$ (separately; do not aggregate them). What do you notice? Can you explain this behavior?
\item Plot a box-and-whiskers graph (see, for example, Matlab's \texttt{boxplot(X)} command) for the runtime of your solver vs. the ratio $\frac{c}{n}$ for both $n = 25$ and $n = 50$ (separately; do not aggregate them). What do you notice? Can you explain this behavior?
\end{enumerate}
\end{homeworkProblem}
% ---- Problem 4 -----
\begin{homeworkProblem}[Problem 4: Randomized Solvers]
Modify your solver from Problem 2 to branch randomly---branch by selecting a variable uniformly from the set of remaining variables (not branched on; not unit-propagated). Please copy/backup/branch your old solver first. Do \emph{not} hand this version in as your answer for Problem 2 \emph{unless} this is the fastest solver that you have found. You can also use a flag in the command line to change your solver's branching heuristic, but make sure that your \texttt{run.sh} file in Problem 2 has the correct flag set.
We will call your solver from Problem 2 $S_G$ and this randomized solver $S_R$.
\begin{enumerate}
\item For \texttt{problem10.cnf} in the provided files, run $S_R$ your random solver at least 25 times and plot the cumulative distribution function (CDF) of the runtimes. (More runs yield a smoother plot.) You can, for example, use the \texttt{cdfplot(X)} command in Matlab.
\item Suppose that you have run your $S_R$ on a particular problem instance many times. After plotting the CDF (as above) you notice that the CDF function of your solver's runtime looks almost exactly like \begin{equation} F(x) = 1 - \frac{1}{x + 1},\label{eq:cdf}\end{equation} where $F(x)$ is the probability that the run took less than $x$ seconds to complete. (Your CDF plot from part (a) will not necessarily look like this plot.)
How can you use this empirical observation to speed up your solver? Propose a modification to $S_R$, and simulate this modified solver's runtime using the idealized CDF given in Eq~\ref{eq:cdf}. \emph{I.e.}, write a program to sample from the distribution of runtimes for this modified solver, and plot the resulting CDF.
%How can you use this empirical observation to speed up your solver? Simulate this technique using the idealized CDF given in Eq~\ref{eq:cdf} and plot the CDF for $S_R$ after applying this technique. Would this technique help $S_R$ on \texttt{problem10.cnf}? Why or why not? Make specific reference to the CDF plot you generated in part (a).
\item Is $S_G$ better or worse than $S_R$ on \texttt{problem10.cnf} in the provided files? Think of at least three ways to compare the performance of two potentially random algorithms. What are the benefits and drawbacks of each method of comparison?
\end{enumerate}
\end{homeworkProblem}
\end{spacing}
\end{document}