PROGRAMMING IN THE AUTON PROJECT AND SCHENLEY PARK RESEARCH

CONTENTS
INTRODUCTION
SIGNALLING ERRORS
WAITING FOR A KEY PRESS
RANDOM NUMBERS
MEMORY ALLOCATION
VECTORS AND MATRICES
----- Important conventions for using dyms.
----- Basic operations on a dym (DYnamic Matrix)
----- Numerical operations on a dym (DYnamic Matrix)
----- DYVs (DYnamic Vectors)
----- Transformations between dyvs , dyms , 1-d arrays of doubles and 2-d arrays of doubles.
----- Making small dyvs and dyms
----- Complex matrix and vector operations
Integer vectors (type ivec)
String Arrays (type string_array)
Hash Tables

INTRODUCTION

We use a specific style of C programming, which we call "The Allegheny Method" (AM). The basic idea is to put a few standards in place so that a group of us can follow each other's code and update it without any surprises.

The most important thing is simplicity and understandability of code. Also important is modularity. The main formalism we use to achieve this is the "Abstract Data Type" style of programming, which could also be known as "Very Simple Object Programming Without The Fancy Little Crinkles."

Our code will be run on many platforms under many operating systems and using many compilers. There must be nothing system-specific in the code.

SIGNALLING ERRORS

      #include "amut.h"

Call

      void my_error(char *message)

This procedure never terminates, but ends the execution of the program and delivers the message. On some systems it offers the opportunity to enter the debugger.

WAITING FOR A KEY PRESS

      #include "amut.h"

Call

      void wait_for_key()

When running on a console, prompts the user to hit a key. When running under a GUI brings up a dialog box. Words like "modal" and "preemptive" are probably involved.

RANDOM NUMBERS

      #include "amut.h"

We want the same sequence of random numbers for a given seed no matter which system we're on. The following random library is the one to use (thanks, Justin!) it passes randomness tests pretty well.

      void am_srand(int seed)

seeds the random number generator

      void am_randomize()

uses time of day based seed

      int int_random(int n)

returns a random integer between 0 and n-1 inclusive

      double range_random(double low, double high)

returns a uniform random double between low and high inclusive

      double gen_gauss()

gaussian variable, mean 0, variance 1

There exist other random functions, e.g. creating a random direction or random rotation matrix. See stats.h in the utils library.

MEMORY ALLOCATION

      #include "amut.h"

We have our own versions of malloc and free. We use lots of allocating and freeing in the Allegheny Method style of programming, so we need it to be very fast and easy to debug. There are options for helping find and report memory allocation errors and memory leaks.

      char *am_malloc(int size)
      void am_free(char *memory, int size)

Note you must tell us what size you are freeing. This is primarily as a check: in memory debug mode, am_free will check that the thing you are freeing was am_mallocked with the size you claim.

Instead of using the above you will almost always want to use the following macros:

      <type> *AM_MALLOC(<type>)

malloc something of type <type> and returns a pointer to it

      void AM_FREE(<type> *x, <type>)

frees x which we are told is of type <type>

      <type> *AM_MALLOC_ARRAY(<type>, int array_size)
      void AM_FREE_ARRAY(<type> *x_array, <type>, int array_size)

Reporting Memory usage:

      void am_malloc_report()

put this at the very end of your program. It will warn you if anything remains unfreed. It will also tell you what to do in order to trace the leak.

VECTORS AND MATRICES

      #include "amut.h"

Dynamic Matrices and Vectors.

The following routines are not clever, just sensible, efficient implementations of dynamic matrices. Here's an example of their use:

      #include "amdm.h"
      
      FILE *s = fopen(fname,"r");
      dym *dm = read_dym(s,NULL,NULL);  Creates a dynamic matrix from ascii file
      dym *dmt = mk_dym_transpose(dm);  Creates its transpose
      dym *a = mk_dym_mult(dm,dmt);     Creates a square matrix
      invert_dym(a,a);                  Invert a and store the result in a.
                                        (No dynamic memory allocation... to
                                         dynamically make the inverse, do
                                         dym *ainv = mk_invert_dym(a) )
      fprintf_dym(stdout,"(d d$t)^-1",a,"\n");
      
      fclose(s);
      free_dym(dm);
      free_dym(dmt);
      free_dym(a);

Important conventions for using dyms.

Dyms are addressed by (row,column) pairs, 0 <= row < num-of-rows and 0 <= col < number-of-columns.

Some of the dym functions dynamically allocate memory. All such functions have names beginning with mk_

Two of the dym functions free memory. They have names beginning with free_.

All other named functions simply copy data between already-allocated memory.

Many functions have a "mk_" form and a conventional copy form. The conventional form places the result in another argument of the call (which must have been pre-initialized to have the correct number of rows and columns). The name of an argument which gets stuff copied into it usually begins with r_ . The mk_ form creates an appropriately-shaped dym, computes the result, and returns the new dym.

For example, here I explain two functions. The "copy" form of matrix addition which copies the sum of d_1 and d_2 into r_d. And the "mk" form dynamically creates and returns a matrix in which d_1 + d_2 are stored.

      void dym_plus(dym *d_1, dym *d_2, dym *r_d)
      dym *mk_dym_plus(dym *d_1,dym *d_2)

Adds two matrices.

Notes on using dynamic matrices: Remember that you should always free any dynamic matrices you create. Otherwise, their memory will remain allocated for the rest of the time the program runs, which may mean you run out of memory unexpectedly early.

The functions below all obey what we'll call the "matrix memory convention" This means that the arguments to the functions are not directly changed, or have any part over-written, except for the result matrix, which is usually the last argument to the function, called "r_mat". Furthermore it is legal, and fine, to call the function with the matrix to be changed as one of the input arguments. For instance, to double the values inside a matrix defined by matrix *a you could call

      dym_plus(a,a,a)

and after calling each element of a would be twice its normal value.

      dym_mult(a,a,a)

squares a etc.

Basic operations on a dym (DYnamic Matrix)

      dym *mk_dym(int rows, int cols)

Dynamically allocates the dym. The initial values are undefined.

      double dym_ref(dym *dm,int i,int j)

Returns the (i,j)'th element of the matrix. May be implemented as a macro to improve speed. Indexing begins at zero: the jth element of the top row is dym_ref(dm,0,j), the ith element of the leftmost column is dym_ref(dm,i,0).

      double dym_set(dym *dm,int i, int j, double value)

Sets the (i,j)'th element of the matrix to be value. See dym_ref for the indexing convenion.

int dym_rows(dym *dm) gives the number of rows in the dym

int dym_cols(dym *dm) gives the number of cols in the dym

Numerical operations on a dym (DYnamic Matrix)

      void constant_dym(dym *r_d,double value)
      dym *mk_constant_dym(int rows,int cols,double value)

Sets all elements to be value

      void zero_dym(dym *r_d)
      dym *mk_zero_dym(int rows,int cols)

Sets all elements to be 0.0

      void dym_scalar_mult(dym *d, double alpha, dym *r_d)
      dym *mk_dym_scalar_mult(dym *d,double alpha)

Multiplies all elements of d by alpha.

      void dym_scalar_add(dym *d, double alpha, dym *r_d)
      dym *mk_dym_scalar_add(dym *d,double alpha)

Adds alpha onto all elements of d.

      void copy_dym(dym *d, dym *r_d)
      dym *mk_copy_dym(dym *d)

Creates a copy of the dym.

      void dym_plus(dym *d_1, dym *d_2, dym *r_d)
      dym *mk_dym_plus(dym *a,dym *b)

Adds two matrices.

      void dym_subtract(dym *d_1,dym *d_2,dym *r_d)
      dym *mk_dym_subtract(dym *a,dym *b)

Subtracts a matrix.

DYVs (DYnamic Vectors)

dyvs are just one dimensional matrices. You can conveniently allocate and free them too using the same conventions as for dynamic matrices.

      dyv *mk_dyv(size)

CREATES and returns one of the fellows.

      int dyv_size(dyv *dv)

returns the number of elements in the dyv

      double dyv_ref(dv,i)

returns the value of the i'th element. Indexing begins at 0: 0 <= i < dyv_size(dv).

All the numerical operations on dym's have counterpart numerical operations on dyv's:

      void constant_dyv(dyv *r_d,double v)
      void zero_dyv(dyv *r_d)
      dyv *mk_constant_dyv(int size,double v)
      dyv *mk_zero_dyv(int size)
      void dyv_scalar_mult(dyv *d, double alpha, dyv *r_d)
      dyv *mk_dyv_scalar_mult(dyv *d,double alpha)
      void dyv_scalar_add(dyv *d, double alpha, dyv *r_d)
      dyv *mk_dyv_scalar_add(dyv *d,double alpha)
      void copy_dyv(dyv *d, dyv *r_d)
      dyv *mk_copy_dyv(dyv *d)
      void dyv_plus(dyv *d_1, dyv *d_2, dyv *r_d)
      dyv *mk_dyv_plus(dyv *a,dyv *b)
      void dyv_subtract(dyv *d_1,dyv *d_2,dyv *r_d)
      dyv *mk_dyv_subtract(dyv *a,dyv *b)

void dyv_sort(dyv *dv,dyv *r_dv); It is fine, as always, if dy and r_dv are the same.

      dyv *mk_dyv_sort(dyv *dv)

Transformations between dyvs , dyms , 1-d arrays of doubles and 2-d arrays of doubles.

We define dym = DYnamic Matrix dyv = DYnamic Vector farr = our notation for an array of doubles. If a is declared by double *a or double a[100], then a is a farr tdarr = our notation for a 2-d array of doubles IMPLEMENTED array of POINTERS to array of doubles. For, example, something created by am_malloc_2d_realnums(rows,cols). BUT NOT SOMETHING DECLARED BY a[100][30] which is implemented as a blocking of 3000 doubles by the C compiler.

There are a huge number of functions to copy, and make items of one of the above types into items of another of the above types.

You can do the obvious copying between the following types:

      Source -> farr    dyv     tdarr     dym
      Dest |
      |
      V
      
      farr               -      y        -       (y)
      dyv                y      -       (y)      (y)
      tdarr              -     (y)       -        y
      dym               (y)    (y)       y        -

The (y)'s in parentheses have two kinds: copying to/from a column of a 2-d structure to a 1-d structure or copying to/from a row of a 2-d structure to a 1-d structure.

Thus, lets make a more complete table for 1-d copying and making:

      Source -> farr    dyv    tdarr_row   tdarr_col dym_row dym_col
      Dest |
      |
      V
      
      farr              -       y        -           -        y       y
      dyv               y    copy_dyv    y           y        y       y
      tdarr_row         -       y        -           -        -       -
      tdarr_col         -       y        -           -        -       -
      dym_row           y       y        -           -        -       -
      dym_col           y       y        -           -        -       -

And for 2-d:

      Source -> tdarr    dym
      Dest V
      tdarr            -       y
      farr             y     copy_dym

Including all the combinations of copying and making copies of, there are at least 50 such functions. To make life simple, theres are strong naming convention for all such functions:

to copy from a <Source> x, to a <Dest> y, call

      copy_<Source>_to_<Dest>(x,y)

e.g;

      double y[10];
      dym *x = mk_constant_dyv(6,1.0); / Creates (1.0,1.0,1.0,1.0,1.0,1.0) /
      copy_dyv_to_farr(x,y);    /  Fills first 6 entries in y with 1.0  /

to make a new thing of type <Dest> from a thing of type <Source> do

      mk_<Dest>_from_<Source>(x)

e.g;

      dym *x = mk_constant_dym(2,3,9.0);
      /  Creates ( 9.0 , 9.0 , 9.0 )
      ( 9.0 , 9.0 , 9.0 )   /
      double *y = mk_tdarr_from_dym(x);    /  Creates

BUT: Some functions need integer numerical arguments too. These always occur after the source and dest argument.

To make a dyv from a farr you need to say how long the farr is, e.g. y = mk_dyv_from_farr(x,6) says to make a dyv of size 6 from the first 6 elements of the array of doubles x. y = mk_dyv_from_dym_row(x,4) says to make a dyv from the the row indexed by 4 (which is of course the fifth row from the top) of the dym x.

Making small dyvs and dyms

      dyv *mk_dyv_1(double x)

makes a 1-element dyv containing x as its only element

      dyv *mk_dyv_2(double x,double y)

makes a 2-element dyv containing x as its 0th-indexed element y as its 1-index element

      dyv *mk_dyv_3(double x,double y , double z)

.... obvious.

And also you can make any of the 9 smallest matrices thusly:

      dym *mk_dym_RC( ... )

where R and C are each either 1 , 2 or 3

and there are R * C arguments, which initialize (in lexical order x00, x01, .. etc) the dym. E.G:

      dym *d = mk_dym_32(1.0,2.0,3.0,4.0,5.0,6.0)

allocates memory and creates a dym =

      [ 1.0 2.0 ]
      [ 3.0 4.0 ]
      [ 5.0 6.0 ]

Complex matrix and vector operations

See amdm.h for information about additional functions, including:

dyv_scalar_product, dyv_dsqd, dym_times_dyv, mk_dym_times_dyv, dym_mult, mk_dym_mult, dyv_outer_product, mk_dyv_outer_product, dym_transpose, mk_dym_transpose, dym_scale_rows, mk_dym_scale_rows, dym_scale_cols, mk_dym_scale_cols, dym_svd, dym_determinant, dym_solve_vector, mk_dym_solve_vector, invert_dym, mk_invert_dym, is_dym_symmetric, attempt_cholesky_decomp, is_dym_symmetric_positive_definite, fprintf_dym, fprintf_dyv, dyv_outer_product, mk_dyv_outer_product, dym_transpose, dym_scale_rows, mk_dym_scale_rows, dym_scale_cols, mk_dym_scale_cols, dyv_mean, dyv_sdev, dym_min, dym_max, dyv_min, dyv_max, dyv_argmin, dyv_argmax, mk_dyv_1, mk_dyv_2, mk_dyv_3, mk_dym_11, mk_dym_12, mk_dym_13, mk_dym_21, mk_dym_22, dym mk_dym_23, mk_dym_ptq, mk_dym_transpose_times_dyv, mk_identity_dym, mk_dym_from_string, mk_nrecipes_matrix_from_dym, mk_nrecipes_vector_from_dyv

Integer vectors (type ivec)

String Arrays (type string_array)

      #include "amiv.h"

A string_array is a variable length, take-care-of-its-own-allocating- and-deallocating array of strings. It is often handy to have such a thing and in some cases it takes the place of Lisp Lists of Symbols. We'll first look at the elementary operations on a string_array and then we'll see some of the major utility functions in which it is used.

Note that entries in a string array must be either NULL or else 0-terminated character strings.

      string_array *mk_string_array(int size);

Makes a string array of length "size" with each element the NULL string.

      void free_string_array(string_array *sar);

Frees the string array and, of course, all its contents.

      string_array *mk_copy_string_array(string_array *sa);

Copies the string array and, of course, all its contents.

      void fprintf_string_array(FILE *s,char *m1,string_array *sar,char *m2);
      void fprintf_string_array_contents(FILE *s,string_array *sar);

      char *string_array_ref(string_array *sar, int i);

Returns a pointer the the i'th element of the string array. Returns NULL if the i'th element is NULL. Does not allocate any memory for this. The caller must not update the string returned. The user must not free the string returned. The caller must not do anything, even look at, the string returned after the owning string array has been freed or after string_array_set(sar,i,XXX) has been called (in either case the string will have been freed).

      void string_array_set(string_array *sar,int i,char *value);

Frees the i'th element in sar, and then COPIES in the new string. If the new string is NULL, that's fine.

      int string_array_size(string_array *sar);
      string_array *mk_string_array_from_array(char **sarr,int size);

Takes a boring old C array and turns it into a yumliscious string_array.

      void add_to_string_array(string_array *sa,char *string);

Dynamically (and efficiently) increases the length of the string array by one, and COPIES in string at the new rightmost array value.

      void string_array_add(string_array *sa,char *string);

This function is synonym for add_to_string_array. Not the preferred name. Retained for compatibility.

      string_array *mk_broken_string(char *string);

whitespace is removed, and each word is placed as a separate entry into the string array. If string is NULL or "" or has only whitespace, a string_array of length 0 is returned.

      char *mk_string_from_last_n(string_array *sa,int n);

Appends together the last n strings in sa into one big string, which is AM_MALLOCKED and returns. A single space is placed between each string.

      char *mk_string_from_string_array(string_array *sa);

Appends together all the strings in sa into one big string, which is AM_MALLOCKED and returns. A single space is placed between each string.

      char *mk_string_from_line(FILE *s);

Inputs a line of text from the stream until a carriage return ('\n') character is found. Makes a 0-terminated string out of what it finds. Does not include the ending \n character in the string. If the file has ended, returns NULL. If no characters on the line returns "" (i.e. an array of chars with one element that has value '\0').

      string_array *mk_string_array_from_line(FILE *s);

As above, except it makes an array instead of a single string, having broken up into substrings according to whitespace.

      int find_index_in_string_array(string_array *sa,char *string);

Finds least i s.t. i >= 0 and string_array_ref(sa,i) has the same printed appearance is string (i.e. such that strcmp would judge them the same). If no such i returns -1.

      bool string_array_member(string_array *sa,char *string);

True if and only if for some i string_array_ref(sa,i) has the same printed appearance as string.

Hash Tables

These are defined in hash.c and hash.h

      #include "hash.h"

A HASH is a data structure that represents an association mapping strings to data objects.

Addition of (key,value) is a constant-time operation (implemented with hashing of key).

Query for the data associated with a given key (which returns NULL if key not stored) is also constant time.

      hash *mk_hash(int array_size,
                    void (*free_data)(char *data),
                    char_ptr (*mk_copy_data)(char *data));

Creates a hash in which data objects (the "value" in "key,value pair" are freed with free_data and have copies made of them (and all their contents) with mk_copy_data.

      char *hash_lookup(hash *h,char *key);

Returns NULL if can't find answer Constant time operation.

      void remove_from_hash(hash *h,char *key);

No effect if key not found Constant time operation.

      void add_to_hash(hash *h,char *key,char *data);

Puts a COPY of key and data into the hash table. If the key already exists in the hash table, removes the old key and old data before putting the new ones in. Constant time operation.

      void free_hash(hash *h);

Frees the hash table and all its contents Constant time operation.

      void fprintf_hash(FILE *s,char *m1,hash *x,char *m2);

      hash *mk_copy_hash(hash *h);

Copies the hash tab;le and all its contents Constant time operation.

      int hash_size(hash *h);

Number of key-data pairs in hash table