\section{Machine Operands}
\label{sec:machine-operands}
%
% Some definitions for describing address expressions.
%
\def \XR{\textit{index-register}}					       %
\def \AS{\textit{address-symbol}}					       %
\def \B {\textit{base}}							       %
\def \X {\textit{index}}						       %
\def \S {\textit{scale}}						       %
\def \D {\textit{displacement}}						       %

% And for the HTML audience:
%
% l2h substitution XR <i>index-register</i>
% l2h substitution AS <i>address-symbol</i>
% l2h substitution B <i>base</i>
% l2h substitution X <i>index</i>
% l2h substitution S <i>scale</i>
% l2h substitution D <i>displacement</i>


\subsection{[[Opnd]] interface}
\label{sec:opnd-interface}

An instruction operand in Machine SUIF is an instance of class [[Opnd]].
Whereas you always deal with instructions using a pointer (of type
[[Instr*]]), an operand is treated as a non-pointer value.  Thus operand
creation functions are not called [[new_...]]; their names all have the
prefix [[opnd_]].  They return an [[Opnd]], not an [[Opnd*]], and you don't
need to worry about [[delete]]-ing operands when you're finished with them.

That's not to say that operands cannot have reference behavior.  As
mentioned earlier, class [[Opnd]] encapsulates a pointer to a SUIF object.
If you insert a mutable operand into multiple instructions without cloning
it, then a side effect on one occurrence will affect the others.  You must
avoid this if you want to reuse OPI code in settings other than Machine
SUIF.

\paragraph{Common functions on operands.}

There are several kinds of operands, and most of the operand-related
functions have to do with one particular kind or another.  Before going
through the kinds, we describe some functions that can be applied to any
operand.

To ask what kind of operand is represented by a particular [[Opnd]] value
(e.g., register operand versus address-symbol operand), you use the
[[get_kind]] function, which returns an integer indicator.  As we go
through the operand kinds in the rest of this section, we'll give the
symbolic names for the various operand-kind indicators.

<<[[Opnd]] common functions>>=
int get_kind(Opnd);
@

There are also predicates for testing whether an operand is of a particular
kind.  Although they are applicable to any operand, we'll also list these
predicates as we go through the kinds.

Every operand has a type, i.e., a [[TypeId]] that gives the type of the
value that the operand represents.  You use the [[get_type]] function to
obtain this type.

<<[[Opnd]] common functions>>=
TypeId get_type(Opnd);
@

The other function that all operands have in common is a replication
utility.

<<[[Opnd]] common functions>>=
Opnd clone(Opnd);
@

[[clone]] returns its argument unchanged unless it is of a kind that has
mutable fields.  (Currently, only the address-expression operands have that
property.)  In this case, it returns a copy of the argument operand, one
whose fields can be changed without affecting the original.

Two operands are equal under operator [[==]] only if their kind
and type attributes are equal and they satisfy additional conditions.
These conditions are spelled out below.  The disequality operator ([[!=]])
just inverts the equality result.

Because you sometimes need to map from operands to other types of values,
we define a function that extracts a hash code from an operand.

<<[[Opnd]] common functions>>=
size_t hash(Opnd);
@

For debugging purposes, there is a function for printing operands in a
machine-independent manner.  This is not the way operands are printed in
machine-specific assembly output, of course.  See Section~\ref{sec:printer}
to understand more about of how that works.

<<[[Opnd]] common functions>>=
void fprint(FILE*, Opnd);
@

\vspace*{2ex}
Now let's go through the different kinds of operands and the functions that
relate to each kind.


\paragraph{Null operands.}

These are simply placeholders.  They always have [[void]] type.  Any null
operand is equal to another one.  There are two OPI functions for null
operands:

<<null-operand functions>>=
Opnd opnd_null();
bool is_null(Opnd);
@

Synopses:

\begin{quote}
[[opnd_null()]]		 returns a null operand. \\
[[is_null(opnd)]]	 returns [[true]] if [[opnd]] is null.
\end{quote}

In addition, the default constructor for class [[Opnd]] is publicly
accessible and yields a null operand.

The kind identifier for a null operand is [[opnd::NONE]].


\paragraph{Variable-symbol operands.}

These are occurrences of source-language variables or compiler-generated
temporary variables.  (Not to be confused with virtual registers, which do
not involve a symbol.)

The OPI has the following functions for variable-symbol operands:

<<variable-symbol-operand functions>>=
Opnd opnd_var(VarSym* var);
bool is_var(Opnd);
VarSym* get_var(Opnd);
@

Synopses:

\begin{quote}
[[opnd_var(var)]] returns a variable-symbol operand referring to [[var]]. \\
[[is_var(opnd)]] returns [[true]] if [[opnd]] is a variable-symbol
	operand. \\
[[get_var(opnd)]] returns the variable symbol embedded in [[opnd]].
\end{quote}

The kind identifier for a variable-symbol operand is [[opnd::VAR]].
Two variable-symbol operands are equal if and only if their embedded symbols
are equal.  Since the type of this kind of operand is taken from the type
of the embedded variable-symbol, this definition of equality implicitly
requires equal types.


\paragraph{Register operands.}

These represent machine registers, both the real architectural registers of
the target machine and any virtual registers (sometimes called
pseudo-registers) created by the compiler prior to register allocation.
Each contains a non-negative register number and an explicit type.  The
real hardware registers have numbers that can be interpreted via the
register description of the target machine (see Section~\ref{sec-reg-desc}).

The OPI provides the following functions for register operands:

<<register-operand functions>>=
Opnd opnd_reg(int reg, TypeId, bool is_virtual = false);
Opnd opnd_reg(TypeId);
bool is_reg(Opnd);
bool is_hard_reg(Opnd);
bool is_virtual_reg(Opnd);
int get_reg(Opnd);
@

Synopses:

\begin{quote}
[[opnd_reg(reg, type, is_virtual)]] returns an operand for register [[reg]]. \\
[[opnd_reg(type)]]	 returns a new virtual register operand. \\
[[is_reg(opnd)]]	 returns [[true]] if [[opnd]] is a register. \\
[[is_hard_reg(opnd)]]	 returns [[true]] if [[opnd]] is a hardware register. \\
[[is_virtual_reg(opnd)]] returns [[true]] if [[opnd]] is a virtual register. \\
[[get_reg(opnd)]]	 returns the register number of [[opnd]].
\end{quote}

The usual way to create a virtual-register operand is simply to provide its
type; in this case, an unused virtual number is allocated and incorporated
in the new operand.  To create a hard-register operand, you supply the
register number and the type.  Occasionally, you need to create a
virtual-register operand with a specific number.  In that case, you provide
the number, the type, and the Boolean [[true]] when calling [[opnd_reg]].

The kind identifier for hardware-register operands is [[opnd::REG_HARD]].
The kind identifier for virtual-register operands is [[opnd::REG_VIRTUAL]].
Two register operands are equal if and only if they are of the same kind,
and their [[reg]] and [[type]] attributes are are pairwise equal.


\paragraph{Immediate operands.}

There are two kinds of immediate operands: one holds integers and the
other strings.  Integer immediates can represent either signed or
unsigned integer immediate values of any magnitude.  The accompanying
type gives the intended precision.  For floating-point immediate
operands, we use strings describing the numeric value, again paired with
a type.  We also use untyped immediate string operands, though they
don't appear in operate instructions.  They allow the pseudo-instruction
([[dot]]) class, which sometimes requires string literals, to use the
same operand interface as the other instruction classes.

The OPI functions for immediate operands are as follows:

<<immediate-operand functions>>=
Opnd opnd_immed(int, TypeId);
Opnd opnd_immed(Integer, TypeId);
Opnd opnd_immed(IdString, TypeId);
Opnd opnd_immed(IdString);

bool is_immed(Opnd);
bool is_immed_integer(Opnd);
bool is_immed_string(Opnd);

int get_immed_int(Opnd);
Integer get_immed_integer(Opnd);
IdString get_immed_string(Opnd);
@

\begin{quote}
[[opnd_immed(integer, type)]] returns an integer immediate operand with type
	[[type]]. \\
[[opnd_immed(string, type)]] returns a floating immediate operand with
	type [[type]]. \\
[[opnd_immed(string)]] returns a string immediate operand,  with no type. \\
[[is_immed(opnd)]] returns [[true]] if [[opnd]] is an immediate. \\
[[is_immed_integer(opnd)]]
	returns [[true]] if [[opnd]] is an integer immediate. \\
[[is_immed_string(opnd)]]
	returns [[true]] if [[opnd]] is a string immediate. \\
[[get_immed_int(opnd)]]
	returns the value of an integer immediate operand as a
	C [[int]], or raises an exception if that's impossible. \\
[[get_immed_integer(opnd)]]
	returns the value of an integer immediate operand.\\
[[get_immed_string(opnd)]]
	returns the value of a string immediate operand.
\end{quote}

Note that, when operand [[opnd]] satisfies [[is_immed_string(opnd)]], you
can tell whether it's numeric or not by checking its type:
[[get_type(opnd)]] returns [[0]] if [[opnd]] represents a
simple string literal.  (In practice, the nature of the immediate operand
is nearly always apparent from the context.)

The kind identifier for an integer-immediate operand is
[[opnd::IMMED_INTEGER]]; that for a string-immediate operand is
[[opnd::IMMED_STRING]].  Two immediate operands are equal if and only
if they are of the same kind and their [[type]] and [[value]] attributes
are pairwise equal.



\paragraph{Address operands.}

The remaining kinds of builtin operands are address symbols and address
expressions, which together we call \emph{address operands}.  An address
operand represents the effective-address (EA) calculation performed during
a machine instruction to generate a memory address.  The type of an address
operand is always a pointer type.  Such an operand is created with the
pointer-to-[[void]] type appropriate for the compilation target.  (This is
the same [[TypeId]] that results from evaluating [[type_ptr]].  See
Section~\ref{sec:types}.)

It is often useful to be able to determine the \emph{referent} type of an
address operand, i.e., the type of the value obtained by dereferencing the
address that the operand represents.  Obviously, this information could be
incorporated in the address-operand type if the OPI required precise
pointer-type constructors.  But that would entail a good deal of pointer
type composition and decomposition at optimization time.  So instead,
address operands carry an additional type attribute called [[deref_type]].
You access and update it it using:

<<address-operand functions>>=
TypeId get_deref_type(Opnd);
void set_deref_type(Opnd, TypeId);
@
				  
If non-null, this attribute gives the type of the dereferenced value.
(We explain below when and why the referent type attribute may be null.)

There is a predicate satisfied only by an address operand:

<<address-operand functions>>=
bool is_addr(Opnd);
@


\paragraph{Address symbols.}

The simplest kind of address operand is the address symbol, which
represents the address of a variable symbol or a code label symbol.

The OPI functions for address symbols are:

<<address-symbol-operand functions>>=
Opnd opnd_addr_sym(Sym* sym);
bool is_addr_sym(Opnd);
Sym* get_sym(Opnd addr_sym);
@

Synopses:

\begin{quote}
[[opnd_addr_sym(sym)]] returns an address symbol based on [[sym]]. \\
[[is_addr_sym(opnd)]] returns [[true]] if [[opnd]] is an address symbol. \\
[[get_sym(opnd)]] returns the symbol underlying [[opnd]].
\end{quote}

The kind identifier for an address-symbol operand is [[opnd::ADDR_SYM]].
Two address-symbol operands are equal if and only if they have equal
[[sym]] values.  
The [[deref_type]] of an address symbol is always derived from the
underlying symbol.  If it's a variable or procedure symbol, then the
[[deref_type]] is the type of the variable or procedure.  If it's a label
symbol, then [[get_deref_type]] returns [[NULL]].

Recall our earlier example of an $x$86 add-to-memory instruction, with
[[addr]] representing the address of the memory location that is
incremented by [[value]].  If the memory location is that of variable
[[foo]], then operand [[addr]] would be an address symbol constructed from
the [[VarSym*]] for [[foo]] (which we might call [[var_foo]]).  The code
might look like this:

\begin{verbatim}
    Opnd addr = opnd_addr_sym(var_foo);

    append(body, new_instr_alm(addr, ADD, addr, value));
\end{verbatim}


\paragraph{Address expressions.}

There are several varieties of address expression, corresponding to the
addressing modes used in hardware.  We use address operands to represent
whole EA calculations.  Like address symbols, all address expression
operands are created with a type attribute equal to [[type_ptr]].

The unique characteristics of address expressions are that they embed other
operands and they are mutable: you can replace the operands that make up an
address expression.  For example, a ``base-plus-displacement'' operand
represents the address that results at run time from adding a fixed
displacement to the contents of a base register.  Earlier, for instance, we
used the example of an $x$86 stack cell located at [[$esp + 40]].  The
corresponding address-expression operand would contain the register operand
for [[$esp]] as its base part and an immediate operand for the number
[[40]] as its displacement part.

A typical program transformation that could change the constituent operands
of an address expression is register allocation.  A base-plus-displacement
operand with a virtual register as its base would be modified when the
virtual register is allocated to a physical one.

Because address expressions are mutable, you should be aware that the
implementation allows the mutable parts to be shared between occurrences.
Machine SUIF doesn't force you to maintain separate copies, since this
would often be semantically pointless.  In our earlier example of the $x$86
add-to-memory instruction that updates stack location [[40(%esp)]], we
didn't bother to clone the address expression before using it a second
time:
\begin{verbatim}
    append(body, new_instr_alm(addr, ADD, addr, value));
\end{verbatim}
Neither its base register [[%esp]] nor its displacement [[40]] will be
affected by a register allocator, and in any case, the hardware sees [[addr]]
as a single operand, not as two separate once.  Be warned, though, that
Machine SUIF doesn't guarantee to preserve sharing patterns that you may
establish in this way.  For example, when an instruction
is \emph{cloned}, i.e., replicated, its address expressions are replicated
individually.

To produce the variable of type [[Opnd]] called [[addr]] in the above
example, we might write:

\begin{verbatim}
    Opnd esp  = opnd_reg(REG_esp, type_ptr);
    Opnd i40  = opnd_immed(40, type_s32);
    Opnd addr = BaseDispOpnd(esp, i40, type_s32);
\end{verbatim}

Here [[REG_esp]] is an integer variable previously set to the register
number for [[%esp]] (see Section~\ref{sec-reg-desc}).  It is incorporated
into a register operand [[esp]] with the generic pointer type that is
appropriate for the current target; the OPI calls this [[type_ptr]].
Next, the displacement operand [[i40]] is created and given a signed 32-bit
integral type.  Finally, the address expression [[addr]] is constructed
from these base and displacement suboperands.  The third argument to the
constructor [[BaseDispOpnd]] is the [[deref_type]], i.e., the type of the
value in the address that [[addr]] refers to.

Sometimes an address expression is used purely for address computation; the
referent type is unknown and unneeded.  For those cases, the OPI allows you
to omit it, which leaves it [[NULL]].  For example, in Alpha code, a stack
frame is allocated by subtracting a constant from the stack-pointer
register.  This is usually expressed using a load-address instruction:

\begin{verbatim}
    Opnd sp = opnd_reg(REG_sp, type_ptr);   // stack pointer reg ($sp)
    Opnd sz = opnd_immed(-80,  type_s32);   // -(frame size)

    append(body, new_instr_alm(sp, LDA, BaseDispOpnd(sp, sz)));
\end{verbatim}

where [[LDA]] is the Alpha load-address opcode.

Machine SUIF includes these functions for testing and creating address
expressions:

<<address-expression-operand functions>>=
bool is_addr_exp(Opnd);
Opnd opnd_addr_exp(int kind);
Opnd opnd_addr_exp(int kind, TypeId deref_type);
@

\begin{quote}
[[is_addr_exp(opnd)]] returns [[true]] when [[opnd]] is an
	address expression. \\
[[opnd_addr_exp(kind, deref_type)]] returns an address expression operand
	of kind [[kind]] referring to a value of type [[deref_type]].
\end{quote}

The [[kind]] argument to [[opnd_addr_exp]] must be one associated with a
specific address-expression kind, e.g., one of those in
Table~\ref{tab:addr-exp-kinds}.  There is no way to create a ``generic''
address expression.

The [[deref_type]] argument to [[opnd_addr_exp]] may be omitted, in which
case the resulting operand's referent type is unspecified (and it equals
zero when tested).

\paragraph{Programming with address expressions.}

The function [[opnd_addr_exp]] described above is \emph{not} part of the
OPI.  It's meant for extenders of the system.  In optimization passes or
libraries, you should use class [[AddrExpOpnd]] and its subclasses as the
interface to address expressions.

<<class [[AddrExpOpnd]]>>=
class AddrExpOpnd : public Opnd {
  public:
    AddrExpOpnd() { }
    AddrExpOpnd(const Opnd&);
    AddrExpOpnd(Opnd&);
    AddrExpOpnd(int kind, TypeId deref_type = 0)
	: Opnd(opnd_addr_exp(kind, deref_type)) { }

    TypeId get_deref_type() const;

    int srcs_size() const;
    OpndHandle srcs_start()const ;
    OpndHandle srcs_end() const;
    Opnd get_src(int pos) const;
    Opnd get_src(OpndHandle handle) const;
    void set_src(int pos, Opnd src);
    void set_src(OpndHandle handle, Opnd src);
};
@

Here

\begin{quote}
[[AddrExpOpnd(kind, deref_type)]] produces a new address expression with
	the given kind and referent type. (The latter is optional.) \\
[[srcs_size()]] returns the size of the address expression's [[srcs]]
	sequence. \\
[[srcs_start()]] gives a handle on the first element of [[srcs]], if any. \\
[[srcs_end()]] gives the end-sentinel of [[srcs]]. \\
[[get_src(p)]] returns the source of the address expression at position
	[[p]], which may be a zero-based integer or a handle. \\
[[set_src(p, src)]] substitutes  [[src]] for the source of the address
	expression at position [[p]] (which may be a zero-based integer
	or a handle).
\end{quote}

Both [[get_src]] and [[set_src]] extend the [[srcs]] sequence if necessary.

Just as an instruction contains a sequence of direct source operands that
we call its [[srcs]] field, every address expression has a [[srcs]]
sequence, and the same functions for scanning, accessing, and changing the
elements of the sequence apply.  For each kind of address expression, there
is a subclass of Opnd that allows referring to source operands by more
descriptive field names, such as [[base]] or [[disp]], but these fields are
just elements of the overall source sequence.

Although address expressions describe how to compute the {\em value} of an
effective address, they don't encode side effects that might be associated
with a particular addressing mode, such as post-increment.

For two address expression operands to be equal under the [[==]] operator,
they must be of the same kind and have the same [[deref_type]], and their
source operands must be equal in number and pairwise equal under [[==]].


\paragraph{Specific address-expression kinds.}

The [[AddrExpOpnd]] interface is convenient for operations that are
indifferent to the specific kind of an address expression.  A register
allocator, for instance, may make substitutions among the components of an
address expression without needing to know the particular kind.  For other
purposes, though, a specific interface is needed for each kind of address
expression that the target machine supports.


%
% Some definitions for describing address expressions.
%
\def \XR{\textit{index-register}}					       %
\def \AS{\textit{address-symbol}}					       %
\def \B {\textit{base}}							       %
\def \X {\textit{index}}						       %
\def \S {\textit{scale}}						       %
\def \D {\textit{displacement}}						       %

% And for the HTML audience:
%
% l2h substitution XR <i>index-register</i>
% l2h substitution AS <i>address-symbol</i>
% l2h substitution B <i>base</i>
% l2h substitution X <i>index</i>
% l2h substitution S <i>scale</i>
% l2h substitution D <i>displacement</i>

The specific kinds of address expression that the OPI defines are summarized in
Table~\ref{tab:addr-exp-kinds}, which matches each kind indicator with the
corresponding form of address expression.

%
\begin{table}
\vspace*{2ex}
\[
\begin{array}{l@{\hspace{3em}}l@{\hspace{2em}}lcccc}
\mbox{[[opnd::SYM_DISP]]}              &     &   & \AS	     & + & \D & \\
\mbox{[[opnd::INDEX_SYM_DISP]]}        & \XR & + & \AS	     & + & \D & \\
\mbox{[[opnd::BASE_DISP]]}             & \B  &   &	     & + & \D & \\
\mbox{[[opnd::BASE_INDEX]]}            & \B  & + & \X	     &   &    & \\
\mbox{[[opnd::BASE_INDEX_DISP]]}       & \B  & + & \X	     & + & \D & \\
\mbox{[[opnd::INDEX_SCALE_DISP]]}      &     &   & \X\times\S & + & \D & \\
\mbox{[[opnd::BASE_INDEX_SCALE_DISP]]} & \B  & + & \X\times\S & + & \D & 
\end{array}
\] \\
									       %
where

\begin{center}
\begin{tabular}{lp{.5\textwidth}}
$\textit{index-register}$	     & is a register operand		      \\
$\textit{address-symbol}$	     & is an address-symbol operand	      \\
$\textit{base}$ and $\textit{index}$ & are either register or variable-symbol
				       operands; the latter stand for
				       variables to be assigned to registers  \\
$\textit{scale}$		     & is an immediate operand containing an
				       unsigned integer (typically a small
				       power of 2)			      \\
$\textit{displacement}$		     & is an immediate operand containing a
				       signed integer
\end{tabular}
\end{center}
\caption{Address expression categories.}
\label{tab:addr-exp-kinds}
\hrulefill
\end{table}
%
Each of these address-expression kinds has an explicit name for each element of
its [[srcs]] sequence, so that it can be fetched and replaced using
mnemonically-named methods.  Here are the field names:

\begin{tabular}{l@{\hspace*{2em}}p{.5\linewidth}}
[[addr_sym]]	& an address symbol				\\
[[disp]]	& an integer immediate				\\
[[index]]	& a register operand				\\
[[base]]	& a register or variable-symbol operand		\\
[[scale]]	& an integer immediate
\end{tabular}


For each kind of address expression, we now describe the subclass of
[[Opnd]] that gives access to the applicable fields through its
methods. For each of these kinds, there is an operand-kind identifier, a
creation function (prefix [[opnd_]]) and a predicate (prefix [[is_]]).


\paragraph{The \( \AS + \D\) kind.}

This address form represents a fixed displacement ([[disp]]) from the
memory address of a symbol ([[addr_sym]]).  Its [[Opnd]] subclass is
[[SymDispOpnd]].  Its kind identifier is [[opnd::SYM_DISP]].

Class [[SymDispOpnd]] gives the methods for composing, inspecting and
changing symbol-plus-displacement operands:

<<class [[SymDispOpnd]]>>=
class SymDispOpnd : public AddrExpOpnd {
  public:
    SymDispOpnd(Opnd addr_sym, Opnd disp, TypeId deref_type = 0);
    SymDispOpnd(const Opnd&);
    SymDispOpnd(Opnd&);

    Opnd get_addr_sym() const;
    void set_addr_sym(Opnd);
    Opnd get_disp() const;
    void set_disp(Opnd);
};

bool is_sym_disp(Opnd);
@


\paragraph{The \( \XR + \AS + \D \) kind.}

This address form combines the run-time value of an index register with
a fixed displacement and the memory address of a symbol.  Its kind identifier is
[[opnd::INDEX_SYM_DISP]].

Class [[IndexSymDispOpnd]] gives the methods for composing, inspecting and
changing index-plus-symbol-plus-displacement operands:

<<class [[IndexSymDispOpnd]]>>=
class IndexSymDispOpnd : public AddrExpOpnd {
  public:
    IndexSymDispOpnd(Opnd index, Opnd addr_sym, Opnd disp,
		     TypeId deref_type = 0);
    IndexSymDispOpnd(const Opnd&);
    IndexSymDispOpnd(Opnd&);

    Opnd get_index() const;
    void set_index(Opnd);
    Opnd get_addr_sym() const;
    void set_addr_sym(Opnd);
    Opnd get_disp() const;
    void set_disp(Opnd);
};

bool is_index_sym_disp(Opnd);
@


\paragraph{The \( \B + \D \) kind.}

This address form combines the value of a base register with a fixed
displacement.  Its [[Opnd]] subclass is [[BaseDispOpnd]].  Its kind
identifier is [[opnd::BASE_DISP]].  The base operand may either be a
register operand or a variable symbol that represents a register candidate.

Class [[BaseDispOpnd]] gives the methods for composing, inspecting and
changing base-plus-displacement operands:

<<class [[BaseDispOpnd]]>>=
class BaseDispOpnd : public AddrExpOpnd {
  public:
    BaseDispOpnd(Opnd base, Opnd disp, TypeId deref_type = 0);
    BaseDispOpnd(const Opnd&);
    BaseDispOpnd(Opnd&);

    Opnd get_base() const;
    void set_base(Opnd);
    Opnd get_disp() const;
    void set_disp(Opnd);
};

bool is_base_disp(Opnd);
@

\paragraph{The \( \B  + \X \) kind.}

This address form combines the values two registers, base and index.  Its
[[Opnd]] subclass is [[BaseIndexOpnd]]. Its kind identifier is
[[opnd::BASE_INDEX]].

Class [[BaseIndexOpnd]] gives the methods for composing, inspecting and
changing base-plus-index operands:

<<class [[BaseIndexOpnd]]>>=
class BaseIndexOpnd : public AddrExpOpnd {
  public:
    BaseIndexOpnd(Opnd base, Opnd index, TypeId deref_type = 0);
    BaseIndexOpnd(const Opnd&);
    BaseIndexOpnd(Opnd&);

    Opnd get_base() const;
    void set_base(Opnd);
    Opnd get_index() const;
    void set_index(Opnd);
};

bool is_base_index(Opnd);
@


\paragraph{The \( \B + \X + \D \) kind.}

This address form combines two registers (base and index) with a fixed
displacement.  Its [[Opnd]] subclass is [[BaseIndexDispOpnd]].  Its kind
identifier is [[opnd::BASE_INDEX_DISP]].

Class [[BaseIndexDispOpnd]] gives the methods for composing, inspecting and
changing base-plus-index-plus-displacement operands:

<<class [[BaseIndexDispOpnd]]>>=
class BaseIndexDispOpnd : public AddrExpOpnd {
  public:
    BaseIndexDispOpnd(Opnd base, Opnd index, Opnd disp, TypeId deref_type = 0);
    BaseIndexDispOpnd(const Opnd&);
    BaseIndexDispOpnd(Opnd&);

    Opnd get_base() const;
    void set_base(Opnd);
    Opnd get_index() const;
    void set_index(Opnd);
    Opnd get_disp() const;
    void set_disp(Opnd);
};

bool is_base_index_disp(Opnd);
@


\paragraph{The \( \X\times\S + \D \) kind.}

This address form multiplies an index register's value by a fixed scale
factor and adds a fixed displacement.  Its [[Opnd]] subclass is
[[IndexScaleDispOpnd]].  Its kind identifier is [[opnd::INDEX_SCALE_DISP]].

<<class [[IndexScaleDispOpnd]]>>=
class IndexScaleDispOpnd : public AddrExpOpnd {
  public:
    IndexScaleDispOpnd(Opnd index, Opnd scale, Opnd disp,
		       TypeId deref_type = 0);
    IndexScaleDispOpnd(const Opnd&);
    IndexScaleDispOpnd(Opnd&);

    Opnd get_index() const;
    void set_index(Opnd);
    Opnd get_scale() const;
    void set_scale(Opnd);
    Opnd get_disp() const;
    void set_disp(Opnd);
};

bool is_index_scale_disp(Opnd);
@


\paragraph{The \( \B + \X\times\S + \D \) kind.}

This address form combines the value of a base register with the result of
scaling an index register and adding a fixed displacement.  Its [[Opnd]]
subclass is [[BaseIndexScaleDispOpnd]].  Its kind identifier is
[[opnd::BASE_INDEX_SCALE_DISP]].

<<class [[BaseIndexScaleDispOpnd]]>>=
class BaseIndexScaleDispOpnd : public AddrExpOpnd {
  public:
    BaseIndexScaleDispOpnd(Opnd base, Opnd index, Opnd scale, Opnd disp,
			   TypeId deref_type = 0);
    BaseIndexScaleDispOpnd(const Opnd&);
    BaseIndexScaleDispOpnd(Opnd&);

    Opnd get_base() const;
    void set_base(Opnd);
    Opnd get_index() const;
    void set_index(Opnd);
    Opnd get_scale() const;
    void set_scale(Opnd);
    Opnd get_disp() const;
    void set_disp(Opnd);
};

bool is_base_index_scale_disp(Opnd);
@


\subsection{Class [[OpndCatalog]]}

It is often useful in program analysis and optimization to map operands to
a dense range of natural numbers.  Class [[OpndCatalog]] is an abstract
interface for such a map.  For different problems, it is realized by
different concrete subclasses.  In bit-vector data-flow problems, the
numbers assigned to operands are often called \emph{slots}, meaning
positions in the bit vectors.  Here we'll call them \emph{indices}.

<<class [[OpndCatalog]]>>=
class OpndCatalog {
  public:
    virtual ~OpndCatalog();

    virtual int size() const = 0;
	    int num_slots() const { return size(); }

    virtual bool enroll(Opnd, int *index = NULL) = 0;
    virtual bool lookup(Opnd, int *index = NULL) const = 0;

    virtual void print(FILE* = stdout) const;

<<[[OpndCatalog]] protected parts>>
};
@

Here's how the methods are used.

\begin{tabular}{l@{\hspace*{2em}}p{.7\linewidth}}
[[size()]]		& Returns the total number of indices allocated so
			  far for all enrolled operands. \\
[[num_slots()]]		& A deprecated synonym for [[size()]]. \\
{\tt enroll($o$, $i$)}	& Tries to put operand $o$ under management.  Returns
			  [[true]] exactly when $o$ has not been
			  enrolled before and is entered successfully
			  (not filtered out).  In addition, the optional
			  index pointer $i$ serves as an output
			  variable.  If $i$ is non-null, the index
			  for operand $o$ (if it has one) is stored
			  in the location that $i$ points to.  This
			  return of $o$'s index via $i$
			  occurs whether $o$ is newly enrolled or was in
			  the catalog previously. \\
{\tt lookup($o$, $i$)}	& Returns [[true]] if operand $o$ already in the
			  catalog.  In that case, and if the optional
			  index pointer $i$ is non-null, the index
			  for $o$ is stored into the location $i$
			  points to.  Never makes a new entry in the
			  catalog. \\
[[print(]]$c$[[)]]	& Does nothing if the catalog hasn't kept a
			  record of operands enrolled.  If there is such
			  a roll, [[print]] lists each member in order
			  of entry on the output channel $c$, together
			  with the index assigned to it.
\end{tabular}


\subsection{[[Opnd]] implementation}
\label{sec:opnd-implementation}

<<[[Opnd]] definition>>=
class Opnd {
  public:
    Opnd();
    Opnd(const Opnd &other) { o = other.o; }
    Opnd(IrOpnd *other_o) { o = other_o; }
    Opnd & operator=(const Opnd &other) { o = other.o; return *this; }
    ~Opnd() { }

    bool operator==(const Opnd &other) const;
    bool operator!=(const Opnd &other) const { return !(*this == other); }

    operator IrOpnd*() { return o; }
    operator bool();

  protected:
    IrOpnd *o;
};
@

Type [[OpndHandle]] is used for scanning the [[srcs]] field of an address
expression.  The implementation of this field has type
[[suif_vector<IrOpnd>]].  Therefore, [[OpndHandle]] is an alias for the
iterator type associated with this sequence type.

<<[[OpndHandle]] definition>>=
typedef suif_vector<IrOpnd*>::iterator OpndHandle;
@


\paragraph{Current optimization unit.}
To be able to record operands in the appropriate symbol table and to manage
virtual register numbers, we need to focus on the current optimization unit
when its processing starts and to defocus when it ends.

<<focus functions>>=
void focus(FileBlock*);
void focus(OptUnit*);

void defocus(OptUnit*);
@

<<scope management>>=
extern FileBlock *the_file_block;
extern ScopeTable *the_file_scope;

extern OptUnit *the_local_unit;
extern ScopeTable *the_local_scope;
@

<<virtual-register management>>=
extern int the_vr_count;

int next_vr_number();
@


\paragraph{Partial implementation of [[OpndCatalog]].}

Since [[OpndCatalog]] is an interface, meant to be subclassed, it has no
public constructor.  Its protected constructor takes a Boolean argument
[[record]] that controls whether the catalog remembers the operands
enrolled (using a list called \emph{roll}) or not.  The only purpose of the
roll is to enable printing of the catalog for debugging.
    
<<[[OpndCatalog]] protected parts>>=
    protected:
      OpndCatalog(bool record = false)
	  { _roll = record ? new List<Opnd> : NULL; }

      List<Opnd> *roll() const { return _roll; }

    private:
      List<Opnd> *_roll;
@



\subsection{Header file [[opnd.h]]}

The header file for operands has the following outline:

<<opnd.h>>=
/* file "machine/opnd.h" */

<<Machine-SUIF copyright>>

#ifndef MACHINE_OPND_H
#define MACHINE_OPND_H

#include <machine/copyright.h>

#ifndef SUPPRESS_PRAGMA_INTERFACE
#pragma interface "machine/opnd.h"
#endif

#include <machine/substrate.h>

class IrOpnd;

namespace opnd {

enum { NONE,
       REG_HARD,
       REG_VIRTUAL,
       VAR,
       IMMED_INTEGER,
       IMMED_STRING,
       ADDR_SYM,
       SYM_DISP,
       INDEX_SYM_DISP,
       BASE_DISP,
       BASE_INDEX,
       BASE_INDEX_DISP,
       INDEX_SCALE_DISP,
       BASE_INDEX_SCALE_DISP,
};

} // namespace opnd

#define LAST_OPND_KIND opnd::BASE_INDEX_SCALE_DISP

<<[[OpndHandle]] definition>>

<<[[Opnd]] definition>>

<<[[Opnd]] common functions>>

<<null-operand functions>>

<<variable-symbol-operand functions>>

<<register-operand functions>>

<<immediate-operand functions>>

<<address-operand functions>>

<<address-symbol-operand functions>>

<<address-expression-operand functions>>

<<class [[AddrExpOpnd]]>>

<<class [[SymDispOpnd]]>>

<<class [[IndexSymDispOpnd]]>>

<<class [[BaseDispOpnd]]>>

<<class [[BaseIndexOpnd]]>>

<<class [[BaseIndexDispOpnd]]>>

<<class [[IndexScaleDispOpnd]]>>

<<class [[BaseIndexScaleDispOpnd]]>>

<<focus functions>>

<<scope management>>

<<virtual-register management>>

<<class [[OpndCatalog]]>>

#endif /* MACHINE_OPND_H */
@
