\section{Sets of Natural Numbers}
\label{sec:NatSet}

Class [[NatSet]] represents sets of natural numbers.  An instance of
[[NatSet]] is either a finite set or the complement of a finite set.
At present, there are two subclasses of [[NatSet]] that allow you to
pick different space and time cost characteristics.

We view sets as not being constrained by size limits.  One major client
for these set classes is the BVD library for data-flow analysis
\cite{bibbvd}.  In that application, it is useful to be able to begin
developing data-flow information before we know the maximum number of
bit-vector slots that we'll need to represent it.  Furthermore, some
applications of the BVD framework make incremental extensions of
existing data-flow information in which they increase the sizes of the
original bit vectors.

To create an instance of [[NatSet]], you must choose one of the
following subclasses.   They have different underlying representations
and therefore different performance characteristics.

\begin{tabular}{l@{\hspace*{2em}}p{.7\linewidth}}
[[NatSetDense]]	& A [[NatSet]] represented using a bit vector:
		  \begin{itemize}
		  \item Insertion and removal of elements takes place in
			constant time.
		  \item Storage is in general proportional to the value
			of the largest element of the finite-set part.
		  \item Iteration produces elements in
			increasing order, but not necessarily in time
			proportional to the set cardinality.
		  \item Union, intersection, subtraction, and equality
			testing can take time proportional to set
			storage, but they exploit bitwise parallel
			operations. 
		  \end{itemize} \\
[[NatSetSparse]] & A [[NatSet]] represented using a sorted linked list:
		  \begin{itemize}	
		  \item Insertion and removal of elements depends on the
			finite-set cardinality.
		  \item Storage is proportional to the cardinality of the
			finite-set part.
		  \item Iteration produces elements in increasing order
			and in time proportional to the number of
			elements produced.
		  \end{itemize}
\end{tabular}

We also declare an iterator type compatible with all of the
[[NatSet]] types.

<<class [[NatSetIterPure]]>>=
class NatSetIterPure {
  public:
    virtual ~NatSetIterPure() { }

    virtual unsigned current() const = 0;
    virtual bool is_valid() const = 0;
    virtual void next() = 0;

    virtual void insert(unsigned) = 0;
    virtual void remove(unsigned) = 0;
};
@

Its methods are the usual ones for SUIF iterators, except that it supports
insertion and removal of elements into/from the set being traversed.

\begin{tabular}{l@{\hspace*{2em}}p{.7\linewidth}}
[[current()]]		& Return the current element. \\
[[is_valid()]]		& Return [[true]] unless the iterator is exhausted. \\
[[next()]]		& Advance to the next element, if any. \\
[[insert(element)]]	& Inserts [[element]] in the set under iteration. \\
[[remove(element)]]	& Removes [[element]] from the set under iteration.
\end{tabular}

The [[insert]] and [[remove]] methods are provided both for convenience.
It is convenient to be able to modify a set via its iterator without having
to pass around both the set and the iterator.  It is often more efficient
to modify a sorted set at the point currently under the iterator's
attention.  However, the [[insert]] and [[remove]] methods have a
precondition to simplify implementations.  The element being inserted or
removed must not exceed the element currently under scan.


The iterator class used in actual programs is a pointer-based realization
of the interface class [[NatSetIterPure]]:

<<class [[NatSetIter]]>>=
class NatSetIter : public NatSetIterPure {
  public:
    NatSetIter(NatSet&);
    NatSetIter(const NatSetIter&);
    NatSetIter(NatSetIterRep *rep) : rep(rep) { }
    virtual ~NatSetIter();

    virtual NatSetIter& operator=(const NatSetIter&);

    virtual unsigned current() const;
    virtual bool is_valid() const;
    virtual void next();

    virtual void insert(unsigned);
    virtual void remove(unsigned);

  protected:
    NatSetIterRep *rep;    
};
@





\subsection{Class [[NatSet]]}

The [[NatSet]] class interface expresses all of the functionality of the
natural-set types except initial construction.  For that, see the
subclasses [[NatSetDense]] and [[NatSetSparse]] just below.

<<class [[NatSet]]>>=
class NatSet {
  public:
    virtual ~NatSet() { }
    virtual NatSet& operator=(const NatSet&);

    virtual bool is_finite() const = 0;
    virtual bool is_empty() const = 0;

    virtual int size() const = 0;

    virtual bool contains(unsigned element) const = 0;

    virtual void insert(unsigned element) = 0;
    virtual void remove(unsigned element) = 0;

    virtual void insert_all() = 0;
    virtual void remove_all() = 0;

    virtual void complement() = 0;
    
    virtual bool operator==(const NatSet&) const;
    virtual bool operator!=(const NatSet &that) const
	{ return !(*this == that); }

    virtual void operator+=(const NatSet&);
    virtual void operator*=(const NatSet&);
    virtual void operator-=(const NatSet&);

    virtual NatSetIter iter(bool complement = false) = 0;
    virtual NatSetIter iter(bool complement = false) const = 0;

    virtual void print(FILE* = stdout, unsigned bound = UINT_MAX) const;

<<[[NatSet]] extender's interface>>
};
@

Here's a rundown on the [[NatSet]] methods.

\begin{tabular}{l@{\hspace*{2em}}p{.7\linewidth}}
{\tt operator=($s$)}	& Copies $s$ into the current set, preserving
			  the representation style of the current set. \\
[[is_finite()]]		& Returns [[true]] if the current set is finite,
			  i.e., is not the complement of a finite
			  set. \\
[[is_empty()]]		& Returns [[true]] if the current set is empty. \\
[[size()]]		& Must only be applied when the current set [[is_finite]].
			  Returns the number of its elements. \\
[[contains(]]$e$[[)]]	& Returns [[true]] if $e$ is an element of the
			  current set. \\
[[insert(]]$e$[[)]]	& Inserts $e$ into the current set. \\
[[remove(]]$e$[[)]]	& Removes $e$ from the current set. \\
[[insert_all()]]	& Makes the current set represent the whole
			  universe (the complement of the empty set). \\
[[remove_all()]]	& Makes the current set empty. \\
[[complement()]]	& Replaces the current set by its complement. \\
[[operator==(]]$s$[[)]]	& Returns [[true]] if the current set equals $s$. \\
[[operator!=(]]$s$[[)]]	& Returns [[true]] if the current set does not
			  equal $s$. \\
[[operator+=(]]$s$[[)]]	& Unions set $s$ into the current set. \\
[[operator*=(]]$s$[[)]]	& Intersects set $s$ into the current set. \\
[[operator-=(]]$s$[[)]]	& Subtracts set $s$ from the current set. \\
[[iter()]]		& Produces an iterator (of type [[NatSetIter]])
			  over the current set.  This iterator doesn't
			  terminate unless the set is finite. \\
{\tt print($f$, $b$)}	& Prints the elements of the current set to file $f$,
			  excluding those greater than bound $b$.
\end{tabular}


\paragraph{Extender's view of [[NatSet]].}

The implementation of a concrete [[NatSet]] must provide the following:

<<[[NatSet]] extender's interface>>=
  protected:
    friend class NatSetCopy;

    virtual int us_size() const = 0;
    virtual NatSet* clone() const = 0;
@

where

\begin{quote}
[[us_size()]] returns the size of the underlying finite set.\\
[[clone()]] copies the set into the heap and returns a pointer
	to the copy.
\end{quote}


\subsection{Class [[NatSetDense]]}

The [[NatSetDense]] variant of the natural-number sets has the same
functionality as [[NatSet]], but you can construct one from whole cloth.
It is a represented as a [[BitVector]], which is capable of representing
infinite sets: the complement of a finite set has an ``infinity bit'' of
one, meaning that all the bit positions not explicitly represented are
treated as being one.

<<class [[NatSetDense]]>>=
class NatSetDense : public NatSet {
  public:
    NatSetDense(bool complement = false) { if (complement) us.set_to_ones(); }
    NatSetDense(const NatSet&);
    NatSet& operator=(const NatSet& that);

    bool is_finite() const;
    bool is_empty() const;

    int size() const;

    bool contains(unsigned element) const;

    void insert(unsigned element);
    void remove(unsigned element);

    void insert_all();
    void remove_all();

    void complement();
    
    bool operator==(const NatSet&) const;

    void operator+=(const NatSet&);
    void operator*=(const NatSet&);
    void operator-=(const NatSet&);

    NatSetIter iter(bool complement = false);
    NatSetIter iter(bool complement = false) const;

  protected:
    BitVector us;		// underlying set

    int us_size() const;
    NatSet* clone() const;
};
@


\subsection{Class [[NatSetSparse]].}

The [[NatSetSparse]] class is analogous to [[NatSetDense]] but uses a sorted
linked-list representation.  Apart from its constructors and assignment
operator, [[NatSetSparse]] inherits all methods from [[NatSet]].

<<class [[NatSetSparse]]>>=
class NatSetSparse : public NatSet {
  public:
    NatSetSparse(bool complement = false) : is_complemented(complement) { }
    NatSetSparse(const NatSet&);

    NatSet& operator=(const NatSet &rhs);

    bool is_finite() const;
    bool is_empty() const;

    int size() const;

    bool contains(unsigned element) const;

    void insert(unsigned element);
    void remove(unsigned element);

    void insert_all();
    void remove_all();

    void complement();
    
    bool operator==(const NatSet&) const;

    NatSetIter iter(bool complement = false);
    NatSetIter iter(bool complement = false) const;

  protected:
    Set<unsigned> us;
    bool is_complemented;

    int us_size() const;
    NatSet* clone() const;
};
@

\subsection{Class [[NatSetCopy]].}

The [[NatSetCopy]] class plays a different role from those above.  You use
it when you want to make a private copy of an existing set, giving it the
same representation, and hence the same performance characteristics, as
that original set.

<<class [[NatSetCopy]]>>=
class NatSetCopy : public NatSet {
  public:
    NatSetCopy(const NatSet&);
    NatSet& operator=(const NatSet&);

    bool is_finite() const;
    bool is_empty() const;

    int size() const;

    bool contains(unsigned element) const;

    void insert(unsigned element);
    void remove(unsigned element);

    void insert_all();
    void remove_all();

    void complement();
    
    bool operator==(const NatSet&) const;

    void operator+=(const NatSet&);
    void operator*=(const NatSet&);
    void operator-=(const NatSet&);

    NatSetIter iter(bool complement = false);
    NatSetIter iter(bool complement = false) const;

    void print(FILE* = stdout, unsigned bound = UINT_MAX) const;

  protected:
    int us_size() const;
    NatSet* clone() const;

  private:
    NatSet *own;
};
@



\subsection{Header file [[nat_set.h]]}

The header file for module [[nat_set]] has the following layout.

<<nat\_set.h>>=
/* file "machine/nat_set.h" */

<<Machine-SUIF copyright>>

#ifndef MACHINE_NAT_SET_H
#define MACHINE_NAT_SET_H

#include <machine/copyright.h>

#ifndef SUPPRESS_PRAGMA_INTERFACE
#pragma interface "machine/nat_set.h"
#endif

#include <machine/substrate.h>

class NatSet;
class NatSetIterRep;

<<class [[NatSetIterPure]]>>

<<class [[NatSetIter]]>>

<<class [[NatSet]]>>

<<class [[NatSetDense]]>>

<<class [[NatSetSparse]]>>

<<class [[NatSetCopy]]>>

#endif /* MACHINE_NAT_SET_H */
@
