                                                              March, 1994


                         AUIS Porting Guidelines

Manufacturers continue to produce new hardware/operating system
platforms on which people would like to utilize AUIS, the Andrew User
Interface System.  This document describes some of the tasks essential
in doing a port.

Throughout, a file name beginning with ./ refers to a file in the AUIS
source distribution.  Installed files are referred to as if they are
installed in /usr/andrew, though a site may install the tree elsewhere.  

In general, the term <platform> below refers to a name chosen to refer
to the platform.  Often it is the name used by Transarc to identify the
platform and therefore available as the `sys` command or the value of
@sys as a path component.  Usually the name is a compound listing first
the hardwar and then the operating system;  for example, the <platform>s
for the IBM RT/PC are rt_aix221, rt_ aos4, and rt_mach.  The platform
name must be #defined as the value of the variable SYS_NAME in
./config/<platform>/system.h.

________________________________________________________
General rules regarding porting AUIS to a new platform:

    Make as few changes as is possible; the fewer additional characters,
    lines and files, the better.

    Try to restrict your changes to:
        ./config/<platform>/system.{h,mcr}
        ./overhead/class/machdep/<platform>/...
        ./atk/console/stats/<platform>/...

________________________________________________________
The config Directory

The config directory contains the imake configurations files, including
the imake "rules" files, the .mcr files, and the .h files.  These are
used to generate Makefiles from Imakefiles and to produce the andrewos.h
include file which is included in all source code files.

When generating a Makefile from an Imakefile, the file
./config/imake.tmpl is preprocessed.  It defines various Makefile
variables and has #includes for
	platform.tmpl   (which #includes system.mcr)
	andrew.rls  (which #includes site.rls)
	the-Imakefile-itself
Platform.tmpl is responsible only for determining what platform this is.
 It then #includes the system.mcr file for the platform.  
System.mcr defines Makefile variables with '='.  It #includes system.h,
allsys.mcr, and site.mcr.  A flag is set so the system.h does not
#include header files for the operating system.
System.h #defines various preprocessor variables and also #includes
allsys.h, various operating system include files, and site.h.
Site.mcr and site.h should be empty in a distribution.  They can be
utilized at a given site to override definitions given in the other
files.
The andrew.rls files (and site.rls) #define preprocessor macros which
are invoked in the Imakefile.  We do not expect you to add new rules; 
if you feel you must, please contact the Consortium staff.

Each .c source file includes andrewos.h, which does nothing more than
#include system.h, but this time the flag is not set and the various
operating system header files are #included.  The latter include
approprately named versions of 
		types.h, syslog.h, unistd.h, signal.h, param.h, fcntl.h, file.h
		dirXXX.h, stringX.h, timeX.h

When adding support for a new platform, you must (1) create a new
subdirectory of ./config whose name identifies the new system type, (2)
populate that new directory with a system.h and system.mcr file, and (3)
add the proper "vendor block" to the file ./config/platform.tmpl, so
that the new files are used under the proper platform.  

 == Adding New Config Files

Suppose you are adding support for a variant of UNIX, called SPINIX,
that runs on the Intel 80386 chip.  You typically create the new
directory config/spinix_i386 and populate it with system files borrowed
from an already-ported i386 system, like sco_i386.  Then add a test in
./config/platform.tmpl to check for a cpp symbol unique to SPINIX, so
that the correct files are used:

    #if defined(i386) && defined(spinix)
    #include <i386_spinix/system.mcr>
    #define MacroIncludeFile i386_spinix/system.mcr
    #endif /* Spinix */
 
With the above "bootstrap" code, it is assume that the two symbols
"i386" and "spinix" are defined by the cpp.  You can generally determine
what symbols are exported by cpp by running the strings command on the
cpp executable:

	% strings /lib/cpp | more

.. or, if you use gcc, use the verbose flag (-v) to see what arguments
gcc passes to cpp:

	% touch foo.c; gcc -v -o foo foo.c
The symbols defined for the compilation are given after -D on the line
showing the call to cpp.

== What goes in a system.h file

The system.h file initializes the cpp state seen by each source module
and each Imakefile.   Each system.h file has this general form:

	#include <allsys.h>
	XXXX
	#ifndef In_Imake
	XYXY
	#endif /* In_Imake */
	XXXX
	#include <site.h>

The token XXXX denotes that arbitrary #defines are used here.  XYXY
denotes that a combination of #defines, additonal file-inclusion
directives and C declarations may be used here.  The contents of the
whole file is available to source modules via inclusion of the header
andrewos.h while only the XXXX sections are included in the final
Makefiles. In the config header files you should *only*: 
	1) define or redefine cpp symbols, 
	2) #include additional header files
		(protected by #ifndef In_Imake if they include arbitrary C constructs)  
	3) Insert C declarations, if absolutely necessary
		(again, place them between #ifndef In_Imake and #endif /* In_Imake */)

== What goes in a system.mcr file

Tthe system.mcr file is included in all Makefiles, after being processed
by cpp.  Macros defined in system.mcr are related to such issues as,
"where is the X11 bin directory" and "what are the debug flags to be
passed to the compiler."

	XBINDIR = /usr/local/bin
	CDEBUGFLAGS = -g

In the config macro files, you should *only* set the values of various
Make variables.  It is OK to set these Make macros based on the state of
the cpp, for instance:

	#ifdef POSIX_ENV
	STD_DEFINES = -D_POSIX_SOURCE -D_ALL_SOURCE
	#endif


________________________________________________________
CLASS -- ./overhead/class/machdep

The machine-dependent class directory implements the dynamic loading
feature of the class system in whatever way is proper for a given
platform.  

== An overview of the dynamic loading implemented by Class

Dynamic loading is a process whereby the code that implements a
particular class is automatically loaded into a running process upon the
first call to one if its classprocedures.  For instance, if I have a
multimedia filesystem browser it is likely that during the course of
browsing I will run across multimedia compound documents containing
references to objects of arbitrary classes.  If those classes have
already been used within the process, then a description of them exists
and instances can be created.  If the class description is not in the
current process, the class has not been previously used and the system
needs to find the proper "dynamic object" to load, to obtain a
description of the class so that a copy can be instantiated.  

The Andrew Class system is fairly complicated but can be described by
the following points:

A class-header file (.ch) contains an ASCII description of a class in
the Class preprocessor language.  The Class Preprocessor translates a
class-header file into a class-import file (.ih) and a class-export file
(.eh) that are included by various source modules.  The class-export
file should be the last file included by the defining module or modules.
 The class-import file is included by any module explicitly using the
class.  The class-export file contains support routines for the class
implementation.  Each method or classprocedure for a class is initially
defined as being one of an array of "jumper" functions that are defined
in the Class library.  The entry points for these "jumper" functions are
defined by an assembly module located in the machine-dependent class
directory, called entry.spp.  It is the purpose of the jumper routines
to remember which original classprocedure was called, load the
approprate module if it hasn't already been loaded, and call the
original classprocedure.  These jumper must be implemented in assembler
because (1) it's important that the stack and registers be cleaned up
such that it looks like the classprocedure was called directly and not
through the jumpers and (2) declaring the jumpers as C functions limits
the number and types of arguments that a classprocedure can have.  Once
a class description has been loaded into the running process, the
remaining classprocedures and methods are re-defined to be their
original routines, and the jumper assembly entry points are no longer
used.

The rest of the dynamic loading system in Class is dependent on what
services are offered by the OS.  Newer OS's provide shared library
support.  A shared library is similar to a dynamically loaded class in
that the code in the shared library is not attached to the process until
it is actually called for.  Even when it is attached, the text portion
of the library code is shared among all the processes using the library
while each process has it's own copy of the data segment.  On systems
that support shared libraries (IBM RS/6000, HPUX, SunOS) the jumper
routines are still needed but the task of actually loading the code into
the running process is greatly simplified as it is provided by a system
call.  In more archaic OS's (BSD), the task of actually loading an
external object module is complicated and calls for additonal steps.  

== Class with Shared Libraries

On platforms that support shared libraries, system calls are provided
that make it possible to attach a module to the running process and to
lookup the address of symbols within that attach module.  On SYSV
platforms the set of system calls are dlopen, dlsym, dlerror, and
dlclose.  The modules that can be manipulated by these shared library
services must be compiled such that they contain no symbols that
absolutely reference other elements of the object file.  This type of
module contains what is called position-independent code (pic).  On the
RS/6000, all code generated by the stock compiler is pic by default. 
Symbol references are all made through a table of contents (TOC) -- one
TOC is maintained for each object file.  On other platforms you must
specify to the compiler that pic is desired by setting the PICFLAG macro
in that platform's system.mcr file.  It would be nice if we could just
compile each and every module as position-independent but there are
compiler & runtime limits under certain platforms that prevent that
simple-minded solution.  One notable example of this problem is under
SunOS4.1.x. where we must compile both pic and non-pic of each module to
avoid a stub table overflow.  For this platform, each directory that
contains source also contains a subdirectory called "shared" where the
pic versions of the modules reside.  Under this scheme, when a dynamic
object is created by makedo, it's underlying object modules are
retrieved from the shared directory while that non-pic versions are used
for archive libraries.  This results in a larger build area but ensures
that relevent compiler/linker limits are not reached.

== Class without Shared Libraries

On platforms that do not support shared libraries, the symbols found in
a module to be dynamically loaded must all be manually relocated in the
running processes address space.  This is a very complicated matter as
it gets into the details of the object file format for a particular
operating system.  Luckily there are some defacto-standards for object
file formats: a.out, COFF, XCOFF, and ELF.  Of these, only a.out is any
trouble because the others come from OS's that support shared libraries.
 In the machine-dependent class directory, there are several
implementations of symbol relocation for a.out format (see 
./overhead/class/machdep/aos_rt/class.c).

There are certain routines in the C library that rely on local data. 
Malloc is one of these routines.  The code for these routines must be
gathered up and linked into the main process (called runapp) and also
these routines must be excluded from the dynamic objects.  The rest of
the routines in the C library must be wrapped up and put into an
auxillary library known as libcx.a.  It is libcx.a that is linked
against the object files that implement a class in the machine-dependent
file makedo.csh.  Makedo's purpose is to run the link-editor (aka: ld)
on the the object files that implement a class, explicitly specifying
the resultant file's entrypoint, and resolving any external references
to symbols in either libcx.a or the list of globals that will be found
in runapp.  This list of global symbols and libcx.a are both created in
the machine-dependent Imakefile via various nm-related hacks that are
very ugly.

When a class is loaded (attached to a running process), it's entrypoint
is looked up and called.  This entrypoint is a routine called
<classname>__GetClassInfo.  This routine is defined in <classname>.eh,
the class's export-header file, and it's purpose is to return a pointer
to a structure that contains information about the classes superclass,
an array of pointers to the class's methods and classprocedures, and the
classes instance data.  The structure returned from GetClassInfo is, in
effect, an instance of the class.  Next, the classes Initializers are
called.  The first time a class is loaded it's InitializeClass routine
is called to setup data common to all instances of the class.  Then the
InitializeObject method is called, to initialize the object's instance
data.

________________________________________________________
Console Statistics -- ./atk/console/stats/<platform>/getstats.c

Getstats runs as a separate process extracting information from the
kernel and passing it to the 'console' application.  (A separate process
is needed because it must be setuid to root on some platforms.) 
Information reported includes disk occupancy (% full) and a host of
statistics from /dev/kmem including CPU load, I/O load, and VM occupancy.

Getstats extracts info at fixed intervals;  for each cycle and each
datum, it prints a line to stdout.  The parent program (Console, or any
other program which wishes to get at this information) is responsible
for setting up a pipe, binding the child's (this program) stdout to one
end of a pipe, and parsing the strings which are passed back.

The format of each line is an ID (int), a colon, a value (int), and
optionally another colon followed by a string.  The ID is coded from the
included file "getstats.h".  

The main program in ./atk/console/stats/common/gsmain.c calls on three
functions which must be defined in
./atk/console/stats/<platform>/getstats.c:

    InitGVMStats()
    Do any required initialization.  Typically, this means reading the
    namelist associated with the kernel to find the locations of the
    information.

    GetGVMStats(int UsersID)
    Send lines to stdout for each relevant statistic in
    ../../lib/getstats.h.  The UserId canbe used to distinguish
    processes belonging to this user from other users.

    GetDiskStats(int Init)
    If Init is TRUE, send lines to stdout giving the mount point of each
    disk.  Otherwise send lines giving the percent of each idsk which is
    occupied.  The ID in these messages will be DISK1, DISK2, ... .

________________________________________________________
Portability Macros

We have introduced a few portability macros in order to keep the sources
as clean as possible and to ease the burden of porting the sources to a
new platform.  

NEWPGRP

    When a program forks off another process, the child often is
    disconnected from its parent's process group so that signals
    generated by the child aren't received by the parent.  On
    BSD-derived platforms, a process can make itself the process group
    leader by issuing the following system call:

        setpgrp(0, getpid());

    while on sysv derived platforms (including all POSIX systems we have
    encountered), it is done this way:

        setpgrp();

    Now there is a macro called NEWPGRP() that is to be used for this
    purpose.  The macro is defined in the system.h platform-dependent
    config file. 

DIRENT_TYPE
DIRENT_NAMELEN(dirEnt)

    To open a directory and read it's entries:

        #include <andrewos.h>

        char *dirName;
        DIR *dir;
        DIRENT_TYPE *dirEnt;

        if(dir = opendir(dirName)) {
                while (dirEnt = readdir(Dir)) 
                        printf("%s\t%d\n", dirEnt->d_name,
        DIRENT_NAMELEN(dirEnt));
        }

    DIRENT_TYPE:                a portability type that represents a
    directory entry
    DIRENT_NAMELEN:     a portability macro that takes a (DIRENT_TYPE *)
    and returns the length of the directory entry name string

SIGSET_TYPE

FILE_HAS_IO

MANDIR
TMACMANFILE
TMACPREFIX

MIN
MAX


________________________________________________________
System definition variables

In addition to the variables described in ./README (and ./README.ez),
the following may be #defined in ./config/<platform>/system.h file

    #define LIBDL_ENV 1
    This variable should be defined iff dynamic loading is to be
    implemented with the SunOS or SYSV dlopen interface.  (See
    ./overhead/class/machdep/sun_sparc_{41,51})

    #define HAVE_SHARED_LIBRARIES 1
    HAVE_SHARED_LIBRARIES apparently influences how the X library is
    linked into dynamic objects or programs which load dynamic objects. 
    Probably by linking with a full path when it is not defined, and
    with -L$(XLIBDIR) -lX11 when it is defined.

    #define NO_SHARED_DIR  1
    When not defined every C file will be compiled twice, and one
    compiled with the contents of $(PICFLAG) will be placed in a
    directory called shared.   When defined every file will be compiled
    once with the contents of $(PICFLAG) and no shared directory will be
    used.

Settable in ./config/<platform>/system.mcr:

    PICFLAG = 
    $(PICFLAG) should have cc switches appropriate to compiling a .c
    file for inclusion in a shared library or dlopenable object. 
