15-381/681 Homework 4
Part I: Resolution in Propositional Logic
Download and the file logic-files.zip and extract its contents.
It contains a copy of utils.py that you will be familiar with from Homework 1, plus a
new file logic2.py that contains code for manipulating formulae in propositional and
predicate logic, and a file wumpus.py that we'll get to later.
Skim the contents of logic2.py so you understand the basics of what it provides.
Try the following expressions by doing "python -i logic2.py" and
then typing the expressions into the Python read-eval-print
Q1. (4 pts) A literal should never appear twice in a clause. For
example, the expression to_cnf('p | q | p') returns (p | q | p),
but this is equivalent to (p | q). Modify
the dissociate function to ensure uniqueness of
all terms. (Look in utils.py for help with this.)
Q2. (4 pts) If any argument to a conjunction is the
constant TRUE, it should be removed from the argument list. If
any argument is FALSE, the entire conjunction should be replaced
by FALSE. Modify the
- a = expr('p>>q')
- b = to_cnf(a)
Resolution establishes the validity of a sentence by assuming
its negation and deriving a contradiction:
- kb = PropKB()
- pl_resolution(kb, 'p | q | ~p')
Some sentences are neither always true nor always false. If
a sentence is not valid (always true), resolution will not be
able to derive a contradiction, so it fails and returns False.
- pl_resolution(kb, a)
- pl_resolution(kb, ~a)
Resolution requires the sentence to be expressed in CNF.
Note that a sentence and its negation can have very different
representations in CNF:
- c = expr('(p<=>q) <=> (q<=>p)')
This version of the resolution algorithm is not very clever. To
prove sentence c above requires 1949 calls to the pl_resolve function:
- pl_resolution(kb, c)
Examine the expressions created during the resolution procedure
and you will see several problems that you can fix to create a
smarter resolution theorem prover.
associate function to
accomplish this. Make the corresponding modification for
disjunctions. Note that FALSE and TRUE are instances defined in
logic2.py; they are not the Python constants False and True.
Q3. (4 pts) If a conjunction's (or disjunction's)
arguments include both a literal and its negation, such as p and
~p, the entire conjunction (disjunction) should be replaced by
the constant FALSE (or TRUE, respectively). Modify
associate function to enforce this.
Q4. (2 pts) If your modifications are correct, to_cnf(c),
where c is the biconditional expression defined above, should
return TRUE. What does to_cnf(~c) return?
Q5. (4 pts) The modifications you made
affect the resolution algorithm. You should find that
pl_resolution(kb, c) now finds the proof after 35 calls to
pl_resolve instead of 1949. But a look at the actual
resolutions shows there is still a lot of redundancy. This is
because on each round the algorithm tries to resolve every
clause ever derived with every other clause ever derived,
yielding N*(N-1)/2 resolutions. We can do better. Hint: going
forward, you may want to modify the print statement in the inner
loop to only show pl_resolve calls where the result list was
Modify the pl_resolution function to maintain a set of all
derived literals. A literal is either a proposition like p or
its negation, ~p. The initial set of derived literals comes
from the clauses supplied as input. For example, if asked to
prove 'p>>q' we would convert this to CNF as ~p|q and then
negate it to get p&~q, which yields two clauses, both
literals: p and ~q. At the beginning of each round of
resolution you should print the set of literals derived so
Q6. (30 pts) If we've already derived a literal p, then
any clause of form (p | q) is useless for resolution because we
know the clause is true even if q is false. If the clause (p |
q) unifies with (~p | r) then we could just as easily unify p
with (~p | r). On the other hand, if the clause (p | q) unifies
with (~q | r) it will yield (p | r), which cannot lead us to
infer r because p is alread proven. This is why (p | q) is
The pl_resolution function maintains a list
clauses containing all the clauses derived
so far. Modify the algorithm so that if any clause contains a
literal we already know to be true (because it appears in the
derived literals set), that clause is removed
clauses before we form pairs for
resolution. (See cautionary note below about removing things
from lists we're iterating over.) You will also need to revise
the termination condition for the algorith since the subset test
will no longer suffice. Since we're removing things from
clauses, in order to prevent the algorithm from
generating the same clauses repeatedly you will need to keep a
separate list of every clause ever generated, and only add new
clauses if they have not been
generated before. Your algorithm should terminate if a round
produces no new clauses. Verify that your theorem prover still
If you implement this refinement correctly, you should be able to
prove the biconditional expression c in roughly 9 calls to
Here are some additional expressions you could use as test cases. Only
one of these is valid:
d = expr('(p>>q>>r) <=> (r<<q<<p)')
e = expr('((p>>q)>>r) <=> (r<<(q<<p))')
Q7. (20 pts) We saw in lecture that if we derive a new
sentence in round i, that sentence must involve at
least one clause that was derived in round i-1,
because otherwise the new sentence would have been derived
pl_resolution to maintain a list
of clauses newly-derived in the current round. In order to tell
if a clause is newly-derived, you must keep track of all clauses
ever derived, even if those clauses have been removed from the
list of resolution candidates by the code you wrote for Q6. In
this new version of
pl_resolution, when you choose
pairs of clauses to resolve, at least one must come from the
Q8. (10 pts) The file wumpus.py contains an
axiomatization of a tiny bit of the wumpus world, and a call to
the theorem prover to prove that square [1,2] does not contain a
pit, i.e., ~P12. Running this file with the original version of
logic2.py takes 3139 calls to pl_resolve before a contradiction
is found, and produces 106 clauses. Running it on your modified
version should take far fewer calls. In our implementation it
takes 654, with just 38 clauses in the last round. But what
happens if we ask the theorem prover to prove P12 instead of
~P12? Since this conclusion is not valid, the only way for the
resolution algorithm to return False is by exhausting all
possible derivations without finding a contradiction. But the
number of combinations is enormous, and the combinatorial
explosion kills us. Most of these sentences are useless for
reaching the kinds of conclusions we care about in the wumpus
world, namely, Pij or ~Pij.
Invent a wumpus-specific heuristic for filtering useless clauses
from the list. (Hint: are clauses containing more than one
"breezy" literal of any use?) Modify pl_resolution to take an
optional third argument
wump that defaults to
False. When this argument is True it should apply your wumpus
filtering heuristic to all newly-derived clauses; when False it
should skip the heuristic and just do normal resolution.
With this wumpus heuristic, asking pl_resolution to prove P12
should fail in under 1000 calls to pl_resolve. In our
implementation it takes under 500. Asking it to prove ~P12
finds a contradiction in 435 calls instead of 645.
Q9. (4 pts) Write a
pl_nature(expression) that takes an
expression as input and returns one of the following:
"tautology" if the expression is always true, "contradiction" if
the expression is always false, or "meh" if the expression is
neither a tautology nor a contradiction. Your function should
make use of the pl_resolution function.
A note about removing items from a list while iterating over the list:
in Python this doesn't work. Try the following example:
You will note that a still has some 1 elements in it. This is because
remove destructively modifies the list and therefore messes up the
iterator set up by the for loop. To avoid this error you need to
iterate over a copy of the list, not the original. You can make
a copy by writing +a.
a = [1,1,2,1,1,1,3,4,1,1,5,1,6]
for x in a:
if x == 1:
Part II: Predicate Logic
Q10. (12 pts) Convert each of the following sentences
into predicate logic using reasonably named predicates,
functions, and constants. If you feel a sentence is ambiguous,
clarify which meaning you're representing in logic.
Q11. (6 pts) What is the most general unifier θ of
the following two literals?
R(x, y, x, f(y))
R(f(z), z, f(g(2)), w)
- Someone in Pittsburgh likes pierogies.
- Everyone who likes ham also likes cheese.
- Mary is tall and Bill is not.
- All of the houses near Sue's house are either large or old (but not both).
- Between any pair of real numbers there is at least one real number.
- There exists a dog that is cuter than all other dogs.
Create a zip archive containing exactly these files: (1) your
modified logic2.py file, (2) an answers.pdf file containing your
answers to the written questions Q4, Q10, and Q11, and (3) a
README.txt file with your name and Andrew id. Note: do not use
other archive formats such as tar, rar, or bz2; you must submit
a zip file.
Submit your zip file via Autolab.