\section{Utilities}

This section contains utilities for Machine SUIF.

\subsection{Operand scanning and replacement}

Utility [[map_opnds]] applies a functional object to the operands of
an instruction and replaces each by the result returned (which may of
course be the same as the original).  The related utilities
[[map_src_opnds]] and [[map_dst_opnds]] do the same for the source
and destination subsets of an instruction's operands, respectively.

The object applied to each operand is of class [[OpndFilter]].  The
only significant method of this class is its ``apply'' operator
([[operator()]]), which takes and returns an [[Opnd]].  You can
subclass [[OpndFilter]] so that the apply operator examines and/or
transforms operands as you wish and so that an instance of your subclass
carries whatever state might be needed when each operand is visited.
The base class is quite simple:

<<class [[OpndFilter]]>>=
class OpndFilter {
  public:
    OpndFilter(bool thru_addr_exps = true)
	{ _thru_addr_exps = thru_addr_exps; }
    virtual ~OpndFilter() {}

    typedef enum { IN, OUT } InOrOut;

    virtual Opnd operator()(Opnd, InOrOut) = 0;

    bool thru_addr_exps() const { return _thru_addr_exps; }
  protected:
    bool _thru_addr_exps;
};
@

The [[map_opnds]], [[map_src_opnds]], and [[map_dst_opnds]]
functions take the instruction whose operands are to be scanned and a
concrete instance of [[OpndFilter]].

<<function [[map_opnds]]>>=
void map_opnds(Instr *instr, OpndFilter &filter);
void map_opnds(Opnd addr_exp, OpndFilter &filter);
void map_src_opnds(Instr *instr, OpndFilter &filter);
void map_dst_opnds(Instr *instr, OpndFilter &filter);
@

The [[map_opnds]] function applies the filter to each operand,
indicating via a second argument whether the operand is an input
([[IN]]) or an output ([[OUT]]) of the instruction.  It replaces the
original operand by the filter's result.

When [[map_opnds]] reaches an address-expression operand, it applies
the filter to the whole operand before doing any of the constituent
operands.  If the filter returns the address-expression operand
unchanged, then the filter is applied to, and may replace, the
constituent operands of the address expression.  Recall that all
constituent operands of an address expression are instruction inputs,
even if the address expression is a destination operand.  The filter
therefore always receives the indicator [[IN]] when applied to a
suboperand of an address expression.

Function [[map_src_opnds]] is just like [[map_opnds]], but it only
maps the direct source operands of the instruction.  Likewise,
[[map_dst_opnds]] maps only the destination operands.

A typical use for [[map_opnds]] is to rewrite operands that have been
assigned to physical registers.  The filter simply checks whether its
argument is one of those to be rewritten.  If so, it returns the
appropriate hard-register operand; if not, it returns its argument
unchanged.




\subsection{Annotation help}

Certain annotations are used often enough to make it worthwhile defining
utilities to query and update them.

\paragraph{Formal parameters assigned to registers.}

The [[param_reg]] annotation records the assignment of a
formal-parameter symbol to a specific argument register.  The only data
stored is the abstract register number.  The following functions manage
[[param_reg]] annotations.

<<parameter-register utilities>>=
bool is_reg_param(VarSym*);
int get_param_reg(VarSym*);
void set_param_reg(VarSym*, int reg);
@

\begin{tabular}{l@{\hspace*{2em}}p{.7\linewidth}}
[[is_reg_param(]]$p$[[)]]	& Returns [[true]] if parameter symbol $p$ is
			  assigned to a register. \\
[[get_param_reg(]]$p$[[)]]	& Returns the abstract number of the register to
			  which parameter $p$ is assigned, or -1
			  if it's not assigned to a register. \\ 
[[set_param_reg(]]$p$[[, ]]$r$[[)]] & Records the assignment of parameter $p$ to
			  register $r$.
\end{tabular}


\subsubsection{Annotation transfer}

Sometimes when you replace one annotated [[IrObject]] with another, you
want to copy the annotations of the former to the latter.  For this we
have:

<<function [[copy_notes]]>>=
extern void copy_notes(IrObject *from, IrObject *to);
@

Although [[copy_notes]] is most often used when the original [[IrObject]]
is about to be discarded, it clones the annotations, rather than removing
them from the original.


\subsubsection{Annotation suppression during printing}

The following list contains annotation keys.  This list is used by the
Machine-SUIF printing utilities to keep output free of tedious
annotations.  Most printing passes allow you to override this list and
print all annotations.

<<non-printing-note keys>>=
extern Set<IdString> nonprinting_notes;
@


\subsection{Symbol, symbol-table, and type utilities}

\paragraph{Symbol predicates.}

The following set of predicates on symbols distinguish their scopes.

<<symbol predicates>>=
bool is_global(Sym*);
bool is_external(Sym*);
bool is_defined(Sym*);
bool is_private(Sym*);
bool is_auto(VarSym*);
@

Predicate [[is_global]] is true of a symbol whose scope is not local to a
procedure.  It works by finding the symbol table to which the symbol
belongs and checking whether its parent is a [[file_set_block]] or a
[[file_block]].  The latter objects own the global symbol tables.

A symbol satisfying [[is_external]] lives in the external symbol table at
the [[file_set_block]] level, which means it is visible externally.  It may
be defined within the file set or outside of it.

A symbol satisfying [[is_defined]] is a global symbol whose defining
declaration is in the current file set.

A symbol satisfying [[is_private]] lives in the symbol table of a
[[file_block]], which means that it is global, but isn't visible outside of
the file.

An ``automatic'' variable, one whose symbol satisfies [[is_auto]], is local
to a procedure.


\paragraph{Symbol finders.}

[[lookup_local_var]] looks up a variable symbol in the current local scope,
while [[lookup_external_var]] looks up a variable symbol in the external
scope, i.e., one that is either exported from or imported by the file
currently being compiled.  [[lookup_external_proc]] does the same for a
procedure symbol.

<<symbol finders>>=
VarSym* lookup_local_var(IdString name);
VarSym* lookup_external_var(IdString name);
ProcSym* lookup_external_proc(IdString name);
@


\paragraph{Symbol creators.}

These functions each generate a new symbol in the local scope with a name
guaranteed not to clash in that scope.

<<symbol creators>>=
VarSym* new_named_var(TypeId, IdString name);
VarSym* new_unique_var(TypeId, const char *prefix = "_var");
VarSym* new_unique_var(Opnd init, const char *prefix = "_var");
LabelSym* new_unique_label(const char *prefix = "_label");

VarSym* new_dispatch_table_var(MbrNote &);
@

With [[new_named_var]], the resulting variable has exactly the [[name]]
given.  With the others, the caller may provide a prefix string for the
new symbol's name.  The name of the resulting symbol has a numeric suffix
that assures uniqueness in the local scope.

[[new_named_var]] and [[new_unique_var]] each generate a variable symbol.
If given a type, [[new_unique_var]] creates an uninitialized variable with
that type.  When an operand is passed instead of a type, it must be a
numeric-immediate operand.  In that case, the type of the operand becomes
the type of the new variable, and the immediate value becomes its initial
value.

[[new_unique_label]] generates a new local code-label symbol.

[[new_dispatch_table_var]] creates and initializes a new unique variable to
serve as the dispatch table for a multiway branch instruction.  The
annotation passed as its argument carries all the necessary information
about the size of the table and the code labels that become its contents.
As a side effect, this function stores the new variable symbol back into
the argument annotation.


\paragraph{Accessing and changing symbol properties.}

To fetch or store the type of a variable or procedure symbol:

<<symbol-property functions>>=
TypeId get_type(VarSym*);
void set_type(VarSym*, TypeId);

TypeId get_type(ProcSym*);
void set_type(ProcSym*, TypeId);
@

To fetch or set the address-taken attribute of a variable:

<<symbol-property functions>>=

bool is_addr_taken(VarSym*);
void set_addr_taken(VarSym*, bool);
@

Function [[update_dispatch_table_var]] updates one target label in the
dispatch table for a multiway branch.  It takes a [[MbrNote]] and the
zero-based index of the multiway branch case.  The annotation supplies both
the variable symbol whose definition holds the dispatch table and the new
label for the case at the given index.

<<symbol-property functions>>=

void update_dispatch_table_var(MbrNote&, int index);
@

Function [[strip_dispatch_table_var]] decommisions the dispatch table for a
multiway branch, leaving only a single entry that contains a null label
value.  Again, the [[MbrNote]] for the branch is the source of a variable
whose definition gets stripped.

<<symbol-property functions>>=

void strip_dispatch_table_var(MbrNote&);
@



\paragraph{Formal parameter helpers.}

These functions inspect the formal parameters of an optimization unit.

<<formal parameter helpers>>=
int get_formal_param_count(OptUnit*);
VarSym* get_formal_param(OptUnit*, int pos);
@

\paragraph{Symbol-table accessors.}

<<symbol-table accessors>>=
SymTable* external_sym_table();
SymTable* file_set_sym_table();
SymTable* get_sym_table(FileBlock*);
SymTable* get_sym_table(ProcDef*);
@

\paragraph{Symbol-table predicates.}

The following set of predicates on symbol tables identify their level
in the hierarchy.

<<symbol-table predicates>>=
bool is_global(SymTable*);
bool is_external(SymTable*);
bool is_private(SymTable*);
@


\paragraph{Taxonomy of types.}

To keep dependence on SUIF's machinery for composing and decomposing type
objects localized, we define functions that operate on a [[TypeId]].  We
make no attempt to cover all of different kinds of type objects that SUIF
provides.  Back end work doesn't make too much use of types.  When others
features are needed, we'll add them.

SUIF views the type kingdom as consisting of data types, procedure types,
and qualified types.  A data type is the type of a data value other than
code.  A procedure type describes a piece of code that you can call.  A
qualified type is based on a data type, but it describes a storage
location, rather than a value.  To the underlying value type, it adds
qualifiers like [[const]] and [[volatile]].  In SUIF, variable symbols,
array elements, and record fields must have qualified type.

Most Machine SUIF code does not need to make these distinctions, because
the helpers we define here take care of coercing type-object pointers to
the appropriate subclasses.


\paragraph{Predicates on [[TypeId]]'s.}

Here are functions that identify different kinds of types.

<<type helpers>>=
bool is_void(TypeId);
bool is_scalar(TypeId);		// data type other than array or record
bool is_boolean(TypeId);
bool is_integral(TypeId);
bool is_signed(TypeId);		// apply only to an integral type
bool is_floating_point(TypeId);
bool is_pointer(TypeId);
bool is_enum(TypeId);
bool is_record(TypeId);		// e.g., struct or union type
bool is_struct(TypeId);
bool is_union(TypeId);
bool is_array(TypeId);
@


\paragraph{Properties of typed values.}

These functions return the size and the alignment requirement, both
expressed in bits, of the value described by a [[TypeId]].  (To say that a
value must have, e.g., 32-bit alignment means that its address in bytes
must be a multiple of four, i.e., 32/8.)

<<type helpers>>=

int get_bit_size(TypeId);
int get_bit_alignment(TypeId);
@

Once in a while, a back end needs to construct a type identifier.  Here are
functions for creating a pointer type, given the type pointed to (which we
call the \emph{referent} type), and for extracting the referent type from a
[[TypeId]] that is known to be a pointer type.

<<type helpers>>=

TypeId pointer_type(TypeId referent);
TypeId get_referent_type(TypeId);
@

Here's a function that creates an array type from the type of its elements
plus the lower and upper bounds of its index range, and one that extract
the element type of an array type.

<<type helpers>>=

TypeId array_type(TypeId element_type, int lower_bound, int upper_bound);
TypeId get_element_type(TypeId array_type);
@

And this one extracts the result type from a procedure type.

<<type helpers>>=

TypeId get_result_type(TypeId type);
@


\subsection{Cloning}

Function [[deep_clone]] copys one IR object, including all of the
subobjects that it owns.  It is implemented as a template function so that
the result type will always be the same as the argument type.

<<cloning functions>>=
template<class T>
T*
deep_clone(T *object)
{
    return to<T>(object->deep_clone(the_suif_env));
}
@

Function [[renaming_clone]] is used to clone a local IR object (one that is
part of an optimization unit) into a new scope, i.e., one governed by a
different local symbol table.  It takes the object and the symbol table of
the destination scope as arguments.  It returns the cloned object.

<<cloning function>>=

IrObject* renaming_clone(IrObject*, SymTable *receiving_scope);
@

Any local symbols found in the object being cloned are themselves cloned
and inserted in the [[receiving_scope]] table.  If necessary, their names
are changed to avoid clashes with existing symbols of that scope.
Non-local symbols of the cloned object are not cloned, since they remain in
scope.

A typical application for [[renaming_clone]] is in inlining, where the body
of one procedure is cloned and inserted into the body of another.


\subsection{A string utility}

Since the [[strdup]] function, which returns a fresh copy of a C string,
is not standard, we provide [[strdupe]] as a substitute.

<<function [[strdupe]]>>=
inline char *
strdupe(const char *s)
{
    int n = strlen(s) + 1;
    char *r = new char[n];
    return (char *)memcpy(r, s, n);
}
@

\subsection{A hashing utility}

The hash code generator for [[long]] integers, needed by SUIF's hash-map
utility, will eventually be part of base SUIF.  For now:

<<hashing helpers>>=
size_t hash(const unsigned long);
@

\subsection{Printing utilities}

Our convention in the OPI is to overload the function [[fprint]] for
printing to a [[FILE*]].  The following variants of [[fprint]] print
[[IrObject]]s (at least at a quality suitable for debugging) and
extended-precision integers.

<<printing utilities>>=
void fprint(FILE*, IrObject*);
void fprint(FILE*, Integer);
@

\subsection{Sequence utilities}

Some utilities that apply to STL container objects.  Function [[clear]]
makes its list argument empty.

<<sequence utilities>>=
template <class Item>
void clear(list<Item> &l)
{
    l.clear_list();
}
@

Function [[get_last_handle]] returns a handle on the last element of its
list argument, or else the sentinal handle of that list.

<<sequence utilities>>=

template <class Item>
list<Item>::iterator
get_last_handle(list<Item> &l)
{
    return l.get_nth(l.size() - 1);
}
@

<<sequence utilities>>=

template <class Item>
list<Item>::iterator
find(list<Item> &l, const Item &item)
{
    for (list<Item>::iterator h = l.begin(); h != l.end(); ++h)
	if (*h == item)
	    return h;
    return l.end();
}
@

<<sequence utilities>>=

template <class Item>
bool
contains(list<Item> &l, const Item &item)
{
    return find(l, item) != l.end();
}
@

Function [[maybe_expand]] makes sure that a vector's size is large enough
to cover an entry at [[index]].  If not, it resizes the vector to cover
[[index]].  Argument [[init]] provides an initial value for any added
elements.

<<sequence utilities>>=

template <class Item>
void
maybe_expand(Vector<Item> &v, size_t index, const Item &init)
{
    if (index >= v.size())
	v.resize(index + 1, init);
}
@

Function [[end_splice]] works arounf the lack of [[splice]] method in the
[[suif_list]] template.  It moves the elements of its second argument to
the end of the first, leaving the donor list empty.  In the real STL, this
can be done in constant time.

<<sequence utilities>>=

template <class Item>
void
end_splice(List<Item> &l, List<Item> &x)
{
    while (!x.empty())
    {
	l.insert(l.end(), x.front());
	x.pop_front();
    }
}
@


\subsection{Miscellany}

The following are handy when changing the representation of a optimization
unit from instruction-list to, say, CFG form.

<<optimization-unit body functions>>=
AnyBody* get_body(OptUnit*);
void set_body(OptUnit*, AnyBody*);
@


\subsection{Header file [[util.h]]}

The header file for module [[util]] has the following layout.

<<util.h>>=
/* file "machine/util.h" */

<<Machine-SUIF copyright>>

#ifndef MACHINE_UTIL_H
#define MACHINE_UTIL_H

#include <machine/copyright.h>

#ifndef SUPPRESS_PRAGMA_INTERFACE
#pragma interface "machine/util.h"
#endif

#include <machine/substrate.h>
#include <machine/machine_ir.h>
#include <machine/opnd.h>
#include <machine/note.h>

<<class [[OpndFilter]]>>

<<function [[map_opnds]]>>

<<parameter-register utilities>>

<<formal parameter helpers>>

<<function [[copy_notes]]>>

<<non-printing-note keys>>

<<symbol predicates>>

<<symbol finders>>

<<symbol creators>>

<<symbol-property functions>>

<<type helpers>>

<<symbol-table predicates>>

<<symbol-table accessors>>

<<cloning functions>>

<<hashing helpers>>

<<printing utilities>>

<<sequence utilities>>

<<optimization-unit body functions>>

#endif /* MACHINE_UTIL_H */
@
