\section{Class [[Cfg]]}
\label{sec-class-cfg}

The two main classes used in the CFG library are [[Cfg]] and
[[CfgNode]].  A [[Cfg]] represents an entire control flow graph for a
procedure.  The nodes within a [[Cfg]] have class [[CfgNode]].  The
functions operating on class [[Cfg]] handle the graph overall: graph
expansion and simplification, graph validation and printing, and the
enumeration of nodes and edges.  Those operating on [[CfgNode]] are about
managing a node's relations with its control and layout neighbors and
out keeping its code consistent with those relations.


\subsection{CFG construction}

To create a CFG or to normalize the form of an existing CFG, use the
following functions:

<<[[Cfg]] creation and normalization>>=
Cfg* new_cfg(InstrList*, bool keep_layout    = false,
			 bool break_at_call  = false,
			 bool break_at_instr = false);
void canonicalize(Cfg*,  bool keep_layout    = false,
                         bool break_at_call  = false,
                         bool break_at_instr = false);
void remove_layout_info(Cfg*);
@

Creation function [[new_cfg]] parses a linear list of instructions into
basic blocks.  There are options for determining precisely what constitutes
a basic block and whether the original linear layout of the instructions
should be reflected by initial layout constraints.  Given an existing CFG,
you may want to ensure that it has the same form as if built from scratch
with a given option set.  The [[canonicalize]] function serves this
purpose.

Each of the flags [[keep_layout]], [[break_at_call]], and
[[break_at_instr]] is optional.  When [[keep_layout]] is [[true]], the
CFG gets node-layout constraints reflecting the program's initial linear
layout.  When [[break_at_call]] is [[true]], each procedure-call
instruction is treated as a CTI, so that it terminates the CFG node that
contains it.  [[break_at_instr]] means that each node of the graph
contains only one instruction.

The [[remove_layout_info]] function clears all layout constraints in a
given graph.


\subsection{CFG inspection}

<<[[Cfg]] inspectors>>=
CfgNode* get_node(Cfg*, int pos);
CfgNode* get_node(Cfg*, CfgNodeHandle);
CfgNode* node_at(Cfg*, LabelSym*);

int nodes_size(Cfg*);
CfgNodeHandle nodes_start(Cfg*);
CfgNodeHandle nodes_last(Cfg*);
CfgNodeHandle nodes_end(Cfg*);

CfgNode* get_entry_node(Cfg*);
CfgNode* get_exit_node(Cfg*);

void set_entry_node(Cfg*, CfgNode*);
void set_exit_node(Cfg*, CfgNode*);
@

Their capsule summaries:

\begin{quote}
[[get_node(cfg, pos)]] returns the node at [[pos]] in [[cfg]]'s
	node list, where [[pos]] is a position in the list (either a
	zero-based integer or a handle). \\
[[get_node(cfg, label)]]
	returns the node of [[cfg]] at [[label]]. \\[1ex]
[[nodes_size(cfg)]] returns the number of nodes in [[cfg]]. \\
[[nodes_start(cfg)]] returns a handle on the first element of [[cfg]]. \\
[[nodes_last(cfg)]] returns a handle on the last element of [[cfg]]. \\
[[nodes_end(cfg)]] returns the sentinel handle for [[cfg]]. \\[1ex]
[[get_entry_node(cfg)]] returns [[cfg]]'s entry node. \\
[[get_exit_node(cfg)]] returns [[cfg]]'s exit node. \\
[[set_entry_node(cfg, entry)]] sets [[cfg]]'s entry node to [[entry]]. \\
[[set_exit_node(cfg, exit)]] sets [[cfg]]'s exit node to [[exit]].
\end{quote}

Type [[CfgNodeHandle]] is an abbreviation for an STL iterator.

<<[[CfgNodeHandle]] definition>>=
typedef list<CfgNode*>::iterator CfgNodeHandle;
@

\paragraph{Nicknames}
The [[nodes_]] functions are so named because the sequence of node pointers
in a [[Cfg]] is called [[nodes]].  These functions are frequently used;
since a [[Cfg]] instance contains only one such sequence, we can provide
shorter names for these handle-related functions without ambiguity.

<<[[Cfg]] inspectors>>=

inline int
size(Cfg *cfg)
{
    return nodes_size(cfg);
}

inline CfgNodeHandle
start(Cfg *cfg)
{
    return nodes_start(cfg);
}

inline CfgNodeHandle
end(Cfg *cfg)
{
    return nodes_end(cfg);
}
@


\subsection{Graph simplification}

The following functions perform graph simplifications.  Each returns
[[true]] if it succeeds in changing the graph.  Since one kind of
simplification can create the opportunity for others, you may want to
repeat these transformations until no progress is made by any
of them.

<<[[Cfg]] simplification>>=
bool remove_unreachable_nodes(Cfg*);
bool merge_node_sequences(Cfg*);
bool optimize_jumps(Cfg*);
@

The function [[remove_unreachable_nodes(cfg)]] removes nodes
unreachable from the entry of [[cfg]]; it renumbers the CFG and thus may
invalidate node numbers that you may have saved.  The function
[[merge_node_sequences(cfg)]] merges sequences of control-equivalent
nodes in [[cfg]], while [[optimize_jumps(cfg)]] eliminates jumps to
jumps in [[cfg]].


\subsection{Graph printing}

When debugging code that uses a CFG, it is often useful
to be able to print an ASCII representation of the graph.

<<[[Cfg]] printing>>=
void fprint(FILE*, Cfg*, bool show_layout, bool show_code);
@

The optional flag [[show_layout]] causes the printout to indicate the
layout predecessor and/or layout successor of any node that has such
constraints.  The optional flag [[show_code]] causes the instructions in
each node to be printed.  Both flags default to [[false]].


\subsection{Header file [[graph.h]]}

Module [[graph]] has the following header file.

<<graph.h>>=
/* file "cfg/graph.h" -- Control Flow Graph */

<<Machine-SUIF copyright>>

#ifndef CFG_GRAPH_H
#define CFG_GRAPH_H

#include <machine/copyright.h>

#ifndef SUPPRESS_PRAGMA_INTERFACE
#pragma interface "cfg/graph.h"
#endif

#include <machine/machine.h>
#include <cfg/cfg_ir.h>

<<[[CfgNodeHandle]] definition>>

<<[[Cfg]] creation and normalization>>

<<[[Cfg]] inspectors>>

<<[[Cfg]] simplification>>

<<[[Cfg]] printing>>

#endif /* CFG_GRAPH_H */
@
