\documentstyle{article}
\setlength{\topmargin}{0in}
\setlength{\headheight}{0in}
\setlength{\headsep}{0in}
\setlength{\textheight}{9in}

\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}
\setlength{\textwidth}{6.5in}
\setlength{\marginparsep}{0in}
\setlength{\marginparwidth}{0in}


\title{A Specification for a Tasking Interface}
\author{}

\begin{document}

\maketitle

\section{Introduction}

This document describes an {\em intertask communication} programming 
interface to support task parallel programs.  If a communication system 
exports such an interface, task parallel programs can easily be run on top 
of it.  In addition to the interface exported by the communication system, 
this document also defines a {\em distribution help} interface imported by 
the communication system.  The idea is to simplify the task of the 
communication system developer by providing a simple interface for address 
computations involving any two distributions.  The communication system can 
request either that an assembled message or an address relation be 
returned.  Since address computations may be compile--time optimized, a 
{\em distribution help registry} interface is defined to allow programs to 
register custom address computation functions.  Registered functions use 
a {\em distribution tools} interface to pack elements into a message 
buffer or add tuples to the address relation.

\section{Tasking Paradigm}

A {\em task} is a data parallel subroutine, possibly with internal 
communication, that executes on some number of identical processors.  
Several tasks may be {\em clustered} into a single {\em module}, which then 
executes them in some order on an architecturally compatible subset of the 
available processors.  Modules execute on disjoint, architecturally 
compatible subsets of the available processors.  In order to fulfill data 
dependencies, {\em distributed arrays} are communicated between modules.

\section{Intertask Communication}

This interface is exported to the program from the communication system
and provides a mechanism for initializing it for a task parallel program,
communicating between tasks, and deinitializing.

\subsection{Initialization}

{\em This is weak...}

Each process will initialize the communication system by describing all
the modules using the following communication system function:
\begin{center}
\begin{verbatim}

typedef struct {
    unsigned architecture,
    char     *name,
    unsigned module_id
} processortype;

int init_modules(unsigned      numprocs,
                 unsigned      nummods,
                 processortype proc[numprocs],
                 unsigned      versionnum)
\end{verbatim}
\end{center}

Each processor is described by its architecture class (currently 
ALPHA\_OSF1, IWARP, C90, and T3D are defined), the name of the processor,
and the module it belongs to.  A module contains several consecutively 
numbered processors of the same architecture class.  

The user guarantees that there is no communication in flight when 
\verb.init_modules. is called and that it is called on all the processors 
with the same arguments.   \verb.Init_modules. should verify that 
\verb.proc. does contain \verb.numprocs. processors in \verb.nummods. 
modules and that modules are defined only on contiguous ranges of 
processors.  If any call to \verb.init_modules. fails, they should all fail
with one of the following error codes:
\begin{center}
\begin{tabular}{|l|l|}
\hline
Error code & Description \\
\hline
\verb.INIT_OK. & No Error \\
\verb.INIT_MODS. & Bad Module Description \\
\verb.INIT_PROC. & Bad Processor Descriptor \\
\verb.INIT_OTHER. & Other Error \\
\hline
\end{tabular}
\end{center}
Error codes in the range $[0,\verb.MIN_UNRESERVED_ERR.)$ are reserved.
The definition of \verb.processortype. shown above is version number 1.


\subsection{Communication}

Intermodule communication is accoplished via collective \verb.t_send. and 
\verb.t_recv. calls.  These functions are called by every processor of a
module, and include the source and destination distributions as arguments.  
The communication system is expected to use this information to transport 
the distributed array between the memory areas specified by the sender and 
receiver.  The \verb.t_send.  and \verb.t_recv.  functions shall have the 
following prototypes:
\begin{center}
\begin{verbatim}
int t_send(unsigned target_module, 
           void     *local_array,
           unsigned data_type,
           void     *sender_distribution,
           unsigned sender_distribution_type,
           void     *receiver_distribution,
           unsigned receiver_distribution_type,
           void     *hint,
           unsigned hint_type)
 
int t_recv(unsigned source_module, 
           void     *local_array,
           unsigned data_type,
           void     *sender_distribution,
           unsigned sender_distribution_type,
           void     *receiver_distribution,
           unsigned receiver_distribution_type,
           void     *hint,
           unsigned hint_type)
\end{verbatim}
\end{center}
The target and source module numbers signify which module will be sent to 
and received from, respectively.  \verb.Local_array. points to the first
data item of the distributed array owned by the calling processor.  The
array elements are of type \verb.data_type..  The communication system 
shall recognize at least the following data types:
\begin{center}
\begin{tabular}{|l|l|}
\hline
\verb.data_type. & Description \\
\hline
\verb.DATA_CHAR.  & Character \\
\verb.DATA_UNSIGNED_CHAR.  & Unsigned Character \\
\verb.DATA_INTEGER.  & Integer \\
\verb.DATA_UNSIGNED_INTEGER.  & Unsigned Integer \\
\verb.DATA_LONG.  & Long Integer \\
\verb.DATA_UNSIGNED_LONG.  & Unsigned Long Integer \\
\verb.DATA_FLOAT.  & Single Precision Floating Point \\
\verb.DATA_DOUBLE.  & Double Precision Floating Point \\
\verb.DATA_COMPLEX_SINGLE.  & Complex, Single Precision \\
\verb.DATA_COMPLEX_DOUBLE.  & Complex, Double Precision \\
\hline
\end{tabular}
\end{center}
Data types in the range $[0,\verb.MIN_UNRESERVED_DATA.)$ are reserved.

The distribution of the array on the sending module shall be described by
the data structure that \verb.sender_distribution. points to. 
The distribution type on the sending module is specified by \\
\verb.sender_distribution_type.   The distribution on the receiving 
module is similarly described by \verb.receiver_distribution. and \\
\verb.receiver_distribution_type..  The following distribution types are
currently defined:
\begin{center}
\begin{tabular}{|l|l|}
\hline
\verb.distribution_type. & Description \\
\hline
\verb.DIST_SCALAR. & Scalar Value \\
\verb.DIST_REPL_ARRAY. & Non-distributed Array \\
\verb.DIST_HPF_JMS.  & General HPF \\
\verb.DIST_HPF_JS.  & Restricted HPF \\
\verb.DIST_HPF_PAD.  & General HPF \\
\verb.DIST_UNKNOWN. & Unknown \\
\hline
\end{tabular}
\end{center}
Distribution types in the range $[0,\verb.MIN_UNRESERVED_DIST.)$ are 
reserved.  In a \verb.t_send., the distribution on the receiver may not be 
known.  If this is the case, \verb.receiver_distribution.  can be set to
\verb.NULL., which means that the communication system must discover the
receiver distribution itself.  Similarly, the receiver distribution type 
may be unknown, in which case \verb.receiver_distribution_type.  can be set 
to \verb.DIST_UNKNOWN..  Both the distribution type and the distribution 
may be unknown.  In a \verb.t_recv., the sender's distribution and 
distribution type may be unknown.

\verb.Hint. points to a data structure that provides a hint to the 
communication system.  That hint is of type \verb.hint_type..  Hints are
intended to be defined by communication system designers, although we do not
rule out defining some ourselves.  Hints must not be required for correct 
operation --- programs that use intertask communication may not supply all
possible hints.  Further, the communication system need not use supplied 
hints.  If no hint is known, \verb.hint. shall be set to \verb.NULL. and 
\verb.hint_type. shall be set to zero.  At present, no hints are defined,
although hint types in the range $[0,\verb.MIN_UNRESERVED_HINT.)$ are
reserved.

To communicate a distributed array, every processor in the sending module 
will call \verb't_send'.  Each \verb't_send' call may differ only in the 
\verb'local_array' parameter.  Similarly, each processor in the receiving 
module will call \verb't_recv', with each call's parameters the same, 
except for possibly \verb'local_array'.  Since each processor of a module 
will call \verb't_send' or \verb't_recv', we refer to them as {\em 
collective} calls.  When we talk about a call made by a single processor, 
we will refer to it as a {\em local} call.  Local \verb.t_send.  calls may 
be either blocking or non-blocking.  However, local \verb.t_recv.  calls 
are always blocking.

A collective \verb't_send' and \verb't_recv' are said to {\em match} if 
the sending module's \verb.target_module. is the same as the receiving 
module's module number, the receiving module's \verb.source_module. is the
same as the sending module's module number, and the other parameters of the
calls match, with the exception of \verb.local_array., \verb.hint., and
\verb.hint_type.. The user of these calls guarantees that if more than one array is 
communicated between two modules, the order in which the arrays are sent 
and received are the same.

When a collective \verb.t_send.  and \verb.t_recv.  match, the data in the 
distributed array on the sending module is communicated to the receiving 
module.  After the communication, identical copies of the data exist on 
both the sending and the receiving module.  The distribution of the data on 
the sending module has not changed, and the distribution on the receiving 
module conforms to \verb.receiver_distribution..  Very few restrictions are 
placed on how the communicatiosn system accomplishes this (see Section 
\ref{sec:restrictions}.)

Each local \verb.t_send. and \verb.t_recv. call should return one of
the following error codes:
\begin{center}
\begin{tabular}{|l|l|}
\hline
Error code & Description \\
\hline
\verb.COMM_OK.  & No Error \\
\verb.COMM_PARM.  & Parameter Error \\
\verb.COMM_NODIST & Can't get Dist. or Type \\
\verb.COMM_MEM.  & Out of Memory \\
\verb.COMM_NET.  & Network Failure \\
\verb.COMM_COMP.  & Computation Failure \\
\verb.COMM_NOP.  & Did Nothing \\
\hline
\end{tabular}
\end{center}

\subsection{Deinitialization}

As each processor completes its task in its module, it should call
\begin{center}
\begin{verbatim}
int deinit_module(void)
\end{verbatim}
\end{center}
After every processor has called \verb.deinit_module., the communication 
system can discard all information it has accumulated and every call should 
return.  No error codes are defined at this point.

\subsection{Restrictions}
           
There are very few restrictions.  We require that if ``tagged'' 
communication primitives are used which may also be used by data parallel 
commmunication inside a task, tags must be greater than or equal to
\verb.T_MINTAG..   The reserved ranges for error codes, distribution types,
relation types, hint types, and data types should be heeded.


\section{Distribution Help Interface}

We do not expect the communication system designer to provide support for
all possible data distributions, especially as this can be a quite 
complex task.  Instead, we give the designer the option of adding such 
support, or of calling on {\em distribution help} support functions which
we provide.   These functions come in two forms.  In the first form, 
the functions directly assemble and disassemble messages for each 
sender/receiver processor pair.  In the second form, the functions return 
an {\em address relation} between each sender/receiver processor pair.  The
communication system can use these relations in any way it desires.  

\subsection{Direct Assembly/Disassembly}

The following are the prototypes for the direct assembly/disassembly 
functions:
\begin{center}
\begin{verbatim}
unsigned get_buffer_size(unsigned data_type,
                         void     *sender_distribution,
                         unsigned sender_distribution_type,
                         unsigned sender_processor,
                         void     *receiver_distribution,
                         unsigned receiver_distribution_type,
                         void     receiver_processor)
                         
int      compute_and_assemble(void     *buffer,
                              unsigned buffer_size,
                              void     *local_array,
                              unsigned data_type,
                              void     *sender_distribution,
                              unsigned sender_distribution_type,
                              unsigned sender_processor,
                              void     *receiver_distribution,
                              unsigned receiver_distribution_type,
                              void     receiver_processor)
                                                           
int      compute_and_disassemble(void     *buffer,
                                 unsigned buffer_size,
                                 void     *local_array,
                                 unsigned data_type,
                                 void     *sender_distribution,
                                 unsigned sender_distribution_type,
                                 unsigned sender_processor,
                                 void     *receiver_distribution,
                                 unsigned receiver_distribution_type,
                                 void     receiver_processor)
\end{verbatim}
\end{center}

Assembly and disassembly require a message buffer, a contiguous chunk of 
memory into which the appropriate local array elements are packed.  The 
communication system must allocate one itself, using \verb.get_buffer_size.  
to compute the size needed for the communication between 
\verb.sender_processor.  and \verb.receiver_processor..   It is important
to note that the processor numbers are relative to the first processor
of the task --- if there are $p$ processors in the task, they are numbered
from 0 to $p-1$.

Each processor on the sending module should call 
\verb.compute_and_assemble. for every processor in the destination module, 
send the assembled messages to their destination processors, then 
deallocate the memory.  Each processor on the receiving module should 
receive messages from every processor in the source module, and call
\verb.compute_and_disassemble. for every message.   The communication
system can order the messages in any way it desires, or even avoid 
point-to-point messages --- the \verb.compute_and_assemble. and 
\verb.compute_and_disassemble. functions only perform local copies.

These distribution and distribution type arguments are as described above 
with the important exception that no distribution or distribution type can
be unknown.  This means the communications library must discover this
information before using the distribution help interface.  
The above functions return the following error codes:
\begin{center}
\begin{tabular}{|l|l|}
\hline
Error Code & Description \\
\hline
\verb.COMP_OK.  & No Error \\
\verb.COMP_PARM.  & Bad Parameter \\
\verb.COMP_DIST.  & Unknown or Bad Distribution \\
\verb.COMP_MEM.  & Out of memory \\
\hline
\end{tabular}
\end{center}


\subsection{Address Relations}

The communication system can also request that the distribution help 
system return an address relation instead of assembling or disassembling 
a message.  This is convenient because the communication system may be 
able to cache this relation, or use it to program a scatter/gather DMA device,
for example.  The addresses of an address relation returned by the 
distribution help functions are relative to the starting addresses of the 
first local element on a processor and are element addresses, not byte 
addresses.   To convert to byte addresses, it is necessary to multiply 
them by the size of the array element on the processor and add the 
starting addresses of their correponding local array.  The supported
functions are:
\begin{center}
\begin{verbatim}
int    compute_relation(void     **relation,
                        unsigned *relation_type,
                        void     *sender_distribution,
                        unsigned sender_distribution_type,
                        unsigned sender_processor,
                        void     *receiver_distribution,
                        unsigned receiver_distribution_type,
                        void     receiver_processor)
                        
int    free_relation(void     *relation,
                     unsigned relation_type)
\end{verbatim}
\end{center}

\verb.Compute_relation. computes the address relation between a pair of 
processors.  The distribution, distribution type, and processor number 
arguments are as in the previous section.  The caller specifies the format 
the relation will take by setting \verb.*relation_type.  accordingly before 
the call.  If the caller sets \verb.*relation_type=REL_BEST., permission is 
given to \verb.compute_relation.  to use what it feels is the best format.  The 
following are the currently defined relation types:
\begin{center}
\begin{tabular}{|l|l|}
\hline
\verb.relation_type. & Description \\
\hline
\verb.REL_AAPAIR. & 2-tuples in an Integer Array (pairs) \\
\verb.REL_AABLKB. & Linked list of 3-tuples (blocks) \\
\verb.REL_AASTR. & Linked list of 4-tuples (slices) \\
\verb.REL_MIXED. & Linked list other \verb.relation_types. \\
\hline
\end{tabular}
\end{center}
Relation types in the range $[0,\verb.MIN_UNRESERVED_REL.)$ are reserved.

\verb.Compute_relation. returns a pointer to the relation in 
\verb.*relation., and the format of the relation in 
\verb.*relation_type..  The following error codes are returned:

\begin{center}
\begin{tabular}{|l|l|}
\hline
Error code & Description \\
\hline
\verb.COMP_OK.  & No Error \\
\verb.COMP_PARM.  & Bad Parameter \\
\verb.COMP_DIST.  & Unknown or Bad Distribution \\
\verb.COMP_FMT. & Unknown Format \\
\verb.COMP_MEM.  & Out of memory \\
\hline
\end{tabular}
\end{center}



\section{Distribution Help Registry Interface}

The communication system will access all distribution help functions 
through the distribution help interface.  We will initially support 
various distributions through strictly run--time means.  However, 
compile--time optimizations permit custom address relation computing and/or
message assembling functions to be produced.  For this reason, we permit 
the program to {\em register} such functions for use through the 
distribution help interface.  

\subsection{Custom Function Prototypes}
For custom message assembly/disassembly, the
following function prototypes must be adhered to:
prototype:
\begin{center}
\begin{verbatim}
unsigned custom_buffer_estimate(unsigned data_type,
                                void     *sender_distribution,
                                unsigned sender_processor,
                                void     *receiver_distribution,
                                void     receiver_processor)

int      custom_msg_assemble(void     *buffer,
                             void     *localarray
                             unsigned datatype,
                             void     *sender_distribution,
                             unsigned sender_processor,
                             void     *receiver_distribution,
                             void     receiver_processor)
                              
int      custom_msg_disassemble(void     *buffer,
                                void     *localarray
                                unsigned datatype,
                                void     *sender_distribution,
                                unsigned sender_processor,
                                void     *receiver_distribution,
                                void     receiver_processor)
\end{verbatim}
\end{center}
For custom address relation computation, the following prototypes are 
required:
\begin{center}
\begin{verbatim}
int    custom_compute_relation(void     **relation,
                               unsigned *relation_type,
                               void     *sender_distribution,
                               unsigned sender_processor,
                               void     *receiver_distribution,
                               void     receiver_processor)
                        
int    custom_free_relation(void     *relation,
                            unsigned relation_type)
\end{verbatim}                     
\end{center}                             

\subsection{Registration}

Custom functions are registered for a pair of unreserved sender and receiver 
distribution types:
\begin{center}
\begin{verbatim}
int register_custom(unsigned  sender_distribution_type,
                    unsigned  receiver_distribution_type,
                    int       (*custom_buffer_estimate)(),
                    int       (*custom_msg_assemble)(),
                    int       (*custom_msg_disassemble)(),
                    int       (*custom_compute_relation)(),
                    int       (*custom_free_relation)())
\end{verbatim}
\end{center}
where the registered custom functions should have the prototypes 
described above.  Either the three assembly/disassembly functions can be
registered, or the two relation computation functions can be registered, or 
both can be registered.  If either set is not registered, \verb.NULL. 
must be used as the argument to \verb.register_custom..


\section{Distribution Tools}

\subsection{Copy tools}

\subsection{Relation tools}


                              

\end{document}
