Auton/SPR Coding standards

CONTENTS

Easy-to-Use Code
The Bottom Line
Abstract Data Type programming
----- ADT philosophy.
Standard ADT functions
Quick Guide to useful ADTs already around.
Fine Points of Style
----- CapiTaLiZAtion, braces, indenting.
----- Header file conventions
----- utils/standard.h
----- Simple types

Easy-to-Use Code

There are many things to care about when writing code. For the Auton and SPR work, the order of importance is

      1. Ease of use of each other's libraries
      2. Ease for a new programmer to get involved
      3. Portability
      4. Ease of understanding the code
      5. Useful Efficiency
      6. Useful Modularity
      7. Number of occurences of the words "foo" and "baz" in the code
      8. Cuteness
      9. Programming Style Ingenuity Demonstration
      10. Useless Efficiency

The Bottom Line

We program in ANSI C. Some of us program in C++, but for directly applicable Auton code, or for SPR projects, please use ANSI C and not C++. The code must be very portable, and certainly must run on Unix and Windows 95, Windows NT. If the following conventions are used, and you use the UTILS (and DRAW if doing graphics) libraries for your I/O this won't be a problem for you.

Abstract Data Type programming

The main formalism we use to achieve ease of use is the "Abstract Data Type" style of programming, which could also be known as "Very Simple Object Programming Without The Fancy Little Crinkles." ADTs give most of the regular software engineering benefits you can think of, except that the joys and dangers of inheritence aren't directly available. The lack of using inheritence makes Andrew feel a little bit worried late at night, but he usually gets to sleep anyway.

ADT philosophy.

If you're interested, see the first few chapters of "Object Oriented Software Construction" by Meyer (you can borrow this from Andrew).

In the ADT philospohpy, a programmer who decides to create a bit of software, first decides what are the one or two (maybe three or four, sometimes) primary datastructures they'll create.

Then they decide which operation(s) can CREATE one of these structures. The most common way they can be created is as a function of some other, perhaps more primitive data structures. And/Or being read in from a file. And/Or coming from the user.

Whenever possible, we strongly recommend that a creation function dynamically allocates memory for the new instance of the ADT, and COPIES in all necessary subcomponents. See later. These creation functions should RETURN the newly created instance, and their name should begin with "mk" or "mk_".

Then they write a FREE function to, er, free an instance.

And then some other utilities for accessing the functions. These may include printing functions, drawing functions, publically available reference-to-internal-fields functions and so on.

The software utility is "published" to the rest of the group with a english language description of what each ADT represents, and then a list of creation, freeing and accessing operations on that ADT.

This actually adds up to surprisingly little documentation being required. I understand this will come as a great disappointment to all the enthusiastic documenters in the group.

Standard ADT functions

All data structures have a "make a copy of me and recursively copy all my contents" function. And all data structures have a "free me and recursively free all my contents" function.

If the data structure is call "plop", then the two above functions will be called

         plop *mk_copy_plop(plop *p)

and

         free_plop(plop *p)

Furthermore, any function that returns a plop, and has mk_ in its title (e.g.

           plop *mk_plop_from_qibble_and_squibble(quibble *q,sqibble *sq)
is guaranteed to produce a newly allocated plop, in which all subfields of plop are also allocated (if necessary by including copies of q and sq) so that after its creation nothing that happens to q affects plop, nothing that happens to plop affects q, etc.

Any plop that you create by calling a mk_ function, you must also eventually free with free_plop(p).

Quick Guide to useful ADTs already around.

These are described in the UTILS and DRAW documents.

DYM : Matrix (Dynamic in the sense of always being allocated dynamically; keeps track of its own size and validity). Dynamically alterable size and shape etc.

DYV : Vector (Dynamic too. Whee!)

IVEC : Vector of integers (Dynamic)

STRING_ARRAY : Dynamic array of strings. Surprisingly useful despite their stupidly long name. Often take on the role that simple Lisp lists take on in Common Lisp.

STRING_MATRIX : A matrix of strings!

APICT : Represents a simple line/dot/circle/text pictures. Renderable on X-windows, Windows, Postscript.

And then there are lots of specialized ADTs for machine learning, dynamic programming and combinatorial optimization.

Fine Points of Style

The following are our fine point programming style. We won't be upset if you elect a different style.

CapiTaLiZAtion, braces, indenting.

Many of us use lower-case for almost everything: variable names, structure names, function names, typedef names. Two exceptions: we capitalize macros. And global variables (not that there are many, of course) have a capitalized first letter.

      int An_evil_global = 7;
      
      #define AARDVARK 15.0
      
      typedef struct coordinate
      {
        double x;
        double y;
      } coordinate;
      
      double coordinate_magnitude(coordinate *coo)
      {
        double r = sqrt(coo->x * coo->x + coo->y * coo->y);
        if ( An_evil_global > 22 )
        {
          printf("Yippee! A wild evil global value!\n");
          An_evil_global = (int) AARDVARK;
        }
        return(r);
      }

Note a couple of points of very personal style above: most of us line our open and close curly braces above each other. A deep personal decision, only makeable after hours of reflection. The programmer above did return(r) instead of return r . Why? Even they don't know. Feel free to do differently.

Some people would have declared the function as

      double
      coordinate_magnitude(coordinate *coo)
      {
..that's fine.

We indent by two spaces. We usually try to avoid leaving tabs around in our code, but Visual C++ is very evil that way.

Header file conventions

Here's an example header file. Note how we use the common C idiom of preventing multiple header loading by means of an #ifdef. (This is a simplified version of xdamut/heap.h).

      /*
           File:         heap.h
           Author:       Andrew W. Moore
           Created:      14th February 1992
           Description:  The heap for A-Star search.
      */
      
      #ifndef HEAP_H
      #define HEAP_H
      
      #include "ambs.h"
      #include "amma.h"
      
      typedef struct pqueue
      {
        int size;
        heap *heap;
        pqelt **pqelts;
      } pqueue;
      
      pqueue *mk_pqueue( int max_input );
      void free_pqueue(pqueue *pq);
      bool pq_is_empty ( pqueue *pq );
      int pq_pop_best ( pqueue *pq );
      void pq_insert ( pqueue *pq, int i, double pri );
      bool pq_contains ( pqueue *pq, int input );
      
      #endif /* #ifndef HEAP_H */

utils/standard.h

If you #include any auton headers, or any headers which #include auton headers, you'll implicitly #include "standard.h" from UTILS. That file defines some project-wide constants, and especially some constants that depend upon which platform we are running. Precisely one of the following four constants will be defined:

         UNIX_TTY_PLATFORM -- this code will run under Unix and not produce
                              any graphics.  Any graphics calls will be
                              ignored, or possibly produce a printed warning
         UNIX_XW_PLATFORM  -- this code will run under Unix and is free to
                              produce graphics using damut/amgr.h graphics
                              routines
         PC_TTY_PLATFORM   -- this code will run on PCs (compiled by Visual C++)
                              and must not produce any graphics.  Any
      			graphics calls will be ignored, or possibly
      			produce a printed warning, or possibly pop up
                              graphics windows, though all control will remain
                              with the standard input.
         PC_MVIS_PLATFORM  -- this code will run on PCs, compiled by Visual
                              C++ as part of a single-document project.  It
                              requires the use of Mary's EYE files.  There
                              is no stdio/stderr output or input.  User
                              communication uses expos, apicts, and aform
      			interface functions.  The programmer is free
      			to produce graphics using damut/amgr.h graphics
                              routines.

Simple types

We almost never use the float type. double is preferred. Only use float if you really really feel the need to possibly save memory. Even if you give us a good story about why you're using floats, we'll frown.

In ambs.h (in UTILS) we declare the bool type, and #define TRUE and FALSE. Contrived example:

      bool exclusive_or(bool a,bool b)
      {
        bool result;
        if ( (a && b) || (!a && !b) )
          result = FALSE;
        else
          result = TRUE;
        return(result);
      }

(Note, by the way, a common thing you'll see. At least one of us doesn't like to call return from the middle of a function because it has undertones of "spaghetti code". So this person usually creates a variable called result which is returned at the end. Feel very very free to ignore that piece of over-purity!).