\section{Run-time Support for Tasking}

\subsection{Overview}

The Net-Fx run-time system is primarily responsible for transporting
distributed data structures between submodules.  This communication
should avoid synchronizing the source and destination submodules, but
at the same time, it must support changing data distribution and
submodule bindings.  The details of these bindings, distribution
computation, and of actual data transport over a likely complex and
irregular network are hidden below simple, high-level, non-blocking
collective send and receive calls for whole distributed data
structures.

Within the run-time system itself, the tasks of distribution
computation, maintaining bindings, and communication are segregated
into subsystems with well defined interfaces between them.  This makes
it possible to orthogonally extend each subsystem.  Although we expect
that only a small set of binding semantics will have to be supported,
the number of distribution types needed will grow as Net-Fx is used to
coordinate non-Fx submodules.  Further, to achieve high performance,
it will be necessary to specialize the communication system for
different networks.  By clearly defining the interfaces between
subsystems, we make this specialization possible and easy.

The communication subsystem is the heart of the run-time system and
exports high-level send and receive calls to the Net-Fx program.  Well
defined interfaces connect it to binding subsystem and the
distribution help subsystem.  The binding subsystem maps abstract
submodule and data structure identifiers into their current physical
realization and distributions, respectively.  The distribution help
subsystem performs distribution computation to convert between the
source and destination distributions and distribution types.  Because
some distribution types may not be supported directly by the
distribution help subsystem, or because a faster support function
exists, the Net-Fx program can register support functions for use by
the distribution help subsystem.  Both the registration interface and
the calling convention for such support functions are standardized so
that they are not tied to any implementation of the run-time system.


\subsection{Motivation}

Communicating distributed data structures between submodules can involve 
quite complex {\em distribution computation} performed by code generated by a 
compiler or built into a run-time system.  Traditionally, this code has 
used low level communication primitives supplied by an oblivious 
communication system to actually transport data.  This works fine when the 
network is simple and highly regular, such as the network of a parallel 
supercomputer.  However, The network that Net-Fx orchestrated tasks will 
use to communicate may have an arbitrary topology, links with different 
speeds, and many different protocol stacks.  Further, the network may 
have other, unrelated traffic.  This added complexity forces a rethinking 
of how to achieve high performance communication between submodules.  

\subsection{Bindings
There are three kinds of bindings that are
supported.  Sta

\subsection{Scheme}
Clearly, enabling high performance communication of distributed data 
structures over complex networks is difficult, so we propose a divide and 
conquer approach, dividing the task between the {\em communication system} 
and the {\em distribution help system}.  The job of the communication 
system is to provide high performance collective communication between the 
submodules.  The job of the distribution help system is to perform the 
necessary distribution computation to determine what to actually send and 
to whom.  

The communication system provides an {\em intertask communication API} to 
the Net-Fx program.  This interface consists of non-blocking collective 
send and receive calls, which include the source and target data 
distributions.  A send and receive is invoked on each process of the 
sending and receiving submodules, respectively.  The communication system 
can either interpret the distributions itself, or pass them to the 
distribution help system through the standard {\em distribution help 
interface}.  On the source submodule, the communication system can request 
that a message be assembled for a process in the destination submodule, or 
an {\em index relation} between the locations of data elements local to the 
source processor and their locations on the destination processor be 
computed and returned.   The communication system on the destination 
submodule has symmetric options.  

Some distribution computation functions may be built into the distribution 
help system, but we have found that specializing functions for particular 
distributions via compile-time optimizations can greatly improve 
performance.  On another note, new distribution types may need to be 
supported without adding that support directly to the distribution help 
system.  For these reasons we permit custom distribution computation 
functions linked to the Net-Fx executable to be {\em registered} for use
by the distribution help system.  All such function must adhere to a 
single interface.  For clarity, consider an example:  Before
performing a communication for which it has a custom distribution 
computation function, the Net-Fx program will register that function.
This establishes a binding between the distributions the function supports
and the function itself.  The program then invokes a send or receive,
passing those distribution as arguments.  The communication system will
then call the distribution help system with the distributions.  At this
point, it will be recognized that there exists a custom function for
the distributions and it will be invoked.  



\subsection{Binding Issues and Synchronization}

Everything said up to here has concern static submodule and data 
distribution bindings.  Allowing for dynamic bindings opens up a can of 
worms because we may want to require a protocol to support them.  




