Constraint Satisfaction Problems (CSPs) appear in many real-life applications such as scheduling, resource allocation, timetabling, vehicle routing, frequency allocation, etc. Most CSPs can be naturally and efficiently modelled using non-binary (or n-ary) constraints that may involve an arbitrary number of variables. It is well known that any non-binary CSP can be converted into an equivalent binary one. The most well-known translations are the dual encoding [Dechter PearlDechter Pearl1989] and the hidden variable encoding [Rossi, Petrie, DharRossi et al.1990]. The ability to translate any non-binary CSP into binary has been often used in the past as a justification for restricting attention to binary CSPs. Implicitly, the assumption had been that when faced with a non-binary CSP we can simply convert it into a binary one, and then apply well-known generic techniques for solving the binary equivalent. In this paper we will show that this assumption is flawed because generic techniques for binary CSPs are not suitable for binary encodings of non-binary problems.

In the past few years, there have been theoretical and empirical studies on the efficiency of binary encodings and comparisons between binary encodings and the non-binary representation [Bacchus van BeekBacchus van Beek1998,Stergiou WalshStergiou Walsh1999,Mamoulis StergiouMamoulis Stergiou2001,SmithSmith2002,Bacchus, Chen, van Beek, WalshBacchus et al.2002]. Theoretical results have showed that converting non-binary CSPs into binary equivalents is a potentially efficient way to solve certain classes of non-binary problems. However, in the (limited) empirical studies there are very few cases where this appears to be true, with Conway's ``game of Life'' [SmithSmith2002] being the most notable exception. There are various reasons for this. In many cases, the extensive space requirements of the binary encodings make them infeasible. Also, in many non-binary problems we can utilize efficient specialized propagators for certain constraints, such as the algorithm developed by Régin Regin94 for the all-different constraint. Converting such constraints into binary is clearly impractical. Another reason, which has been overlooked, is that most (if not all) experimental studies use well-known generic local consistency and search algorithms in the encodings. In this way they fail to exploit the structure of the constraints in the encodings, ending up with inefficient algorithms. To make the binary encodings a realistic choice of modelling and solving non-binary CSPs, we need algorithms that can utilize their structural properties. Finally, it is important to point out that the use of a binary encoding does not necessarily mean that we have to convert all the non-binary constraints in a problem into binary, as it is commonly perceived. If we are selective in the constraints we encode, based on properties such as arity and tightness, then we can get efficient hybrid models.

To address these issues, we show that the use of specialized arc consistency and search algorithms for binary encodings of non-binary CSPs can lead to efficient models. We consider three encodings; the dual, the hidden variable, and the double encoding. The latter, which is basically the conjunction of the other two encodings, has received little attention but may well turn out to be the most significant in practice. The aim of this study is twofold. First, to present efficient algorithms for the binary encodings and analyze them theoretically and experimentally. Second, and more importantly, to investigate if and when the use of such algorithms can help solve non-binary problems efficiently. Towards these aims, we make the following contributions:

- We describe a simple algorithm that enforces arc consistency
on the hidden variable encoding of an arbitrary non-binary CSP
with
time complexity, where is the number of
constraints, the maximum arity of the constraints, and the
maximum domain size. This gives an improvement compared to
the asymptotic complexity of a generic arc consistency algorithm.
The improved complexity is now the same as the complexity of an
optimal generalized arc consistency algorithm in the non-binary
representation of the problem. We also identify a property of the
arc consistency algorithm for the hidden variable encoding that
can make it run faster, on arc inconsistent problems, than the
generalized arc consistency algorithm.
- We then consider search algorithms that maintain local
consistencies during search in the hidden variable encoding. We
show that, like maintaining arc consistency, all the
generalizations of forward checking to non-binary CSPs can be
emulated by corresponding forward checking algorithms that run in
the hidden variable encoding and only instantiate original
variables (i.e. the variables of the initial non-binary problem).
We show that each such algorithm and its corresponding algorithm
for non-binary constraints have the following relationships: 1)
they visit the same number of search tree nodes, and 2) the
asymptotic cost of each of them is within a polynomial bound of
the other.
- We describe a specialized algorithm for the dual encoding
that achieves arc consistency with worst-case time
complexity. This is significantly lower than the
complexity of a generic arc consistency algorithm. The
improvement in the complexity bound stems from the observation
that constraints in the dual encoding have a specific structure;
namely they are piecewise functional [Van Hentenryck, Deville, TengVan Hentenryck
et al.1992]. Apart from
applying arc consistency in the dual encoding of a non-binary CSP,
this algorithm can also be used as a specialized filtering
algorithm for certain classes of non-binary constraints.
- We adapt various search algorithms to run in the double
encoding and compare them theoretically to similar algorithms for
the hidden variable encoding and the non-binary representation.
Search algorithms that operate in the double encoding can exploit
the advantages of both the hidden variable and the dual encoding.
For example, we show that, under certain conditions, the
asymptotic cost of the maintaining arc consistency algorithm in
the double encoding can only be polynomially worse than the
asymptotic cost of the corresponding algorithm in the non-binary
representation (and the hidden variable encoding), while it can be
exponentially better.
- Finally, we make an extensive empirical study on various domains. We consider random problems as well as structured ones, like crossword puzzle generation, configuration, and frequency assignment. This study consists of two parts. In the first part, we give experimental results that demonstrate the advantages of the specialized algorithms for binary encodings compared to generic algorithms. For example, the specialized arc consistency algorithm for the dual encoding can be orders of magnitude faster than a generic arc consistency algorithm. In the second part we show that the use of binary encodings can offer significant benefits when solving certain classes of non-binary CSPs. For example, solving the dual encoding of some configuration problems can be orders of magnitudes more efficient than solving the non-binary representation. Also, empirical results from frequency assignment - like problems demonstrate that a binary encoding can be beneficial even for non-binary constraints that are intentionally specified.

This paper is structured as follows. In Section 2 we give the necessary definitions and background. In Section 3 we describe a specialized arc consistency algorithm for the hidden variable encoding. We also demonstrate that all the extensions of forward checking to non-binary CSPs can be emulated by binary forward checking algorithms that run in the hidden variable encoding. In Section 4 we explain how the complexity of arc consistency in the dual encoding can be improved and describe a specialized arc consistency algorithm. Section 5 discusses algorithms for the double encoding. In Section 6 we present experimental results on random and structured problems that demonstrate the usefulness of the proposed algorithms. We also draw some conclusions regarding the applicability of the encodings, based on theoretical and experimental results. Section 7 discusses related work. Finally, in Section 8 we conclude.

Nikolaos Samaras 2005-11-09