CONTENTS
Easy-to-Use Code
The Bottom Line
Abstract Data Type programming
----- ADT philosophy.
Standard ADT functions
Quick Guide to useful ADTs already around.
Fine Points of Style
----- CapiTaLiZAtion, braces, indenting.
----- Header file conventions
----- utils/standard.h
----- Simple types
There are many things to care about when writing code. For the Auton and SPR work, the order of importance is
1. Ease of use of each other's libraries
2. Ease for a new programmer to get involved
3. Portability
4. Ease of understanding the code
5. Useful Efficiency
6. Useful Modularity
7. Number of occurences of the words "foo" and "baz" in the code
8. Cuteness
9. Programming Style Ingenuity Demonstration
10. Useless Efficiency
We program in ANSI C. Some of us program in C++, but for directly applicable Auton code, or for SPR projects, please use ANSI C and not C++. The code must be very portable, and certainly must run on Unix and Windows 95, Windows NT. If the following conventions are used, and you use the UTILS (and DRAW if doing graphics) libraries for your I/O this won't be a problem for you.
Abstract Data Type programming
The main formalism we use to achieve ease of use is the "Abstract Data Type" style of programming, which could also be known as "Very Simple Object Programming Without The Fancy Little Crinkles." ADTs give most of the regular software engineering benefits you can think of, except that the joys and dangers of inheritence aren't directly available. The lack of using inheritence makes Andrew feel a little bit worried late at night, but he usually gets to sleep anyway.
If you're interested, see the first few chapters of "Object Oriented Software Construction" by Meyer (you can borrow this from Andrew).
In the ADT philospohpy, a programmer who decides to create a bit of software, first decides what are the one or two (maybe three or four, sometimes) primary datastructures they'll create.
Then they decide which operation(s) can CREATE one of these structures. The most common way they can be created is as a function of some other, perhaps more primitive data structures. And/Or being read in from a file. And/Or coming from the user.
Whenever possible, we strongly recommend that a creation function dynamically allocates memory for the new instance of the ADT, and COPIES in all necessary subcomponents. See later. These creation functions should RETURN the newly created instance, and their name should begin with "mk" or "mk_".
Then they write a FREE function to, er, free an instance.
And then some other utilities for accessing the functions. These may include printing functions, drawing functions, publically available reference-to-internal-fields functions and so on.
The software utility is "published" to the rest of the group with a english language description of what each ADT represents, and then a list of creation, freeing and accessing operations on that ADT.
This actually adds up to surprisingly little documentation being required. I understand this will come as a great disappointment to all the enthusiastic documenters in the group.
All data structures have a "make a copy of me and recursively copy all my contents" function. And all data structures have a "free me and recursively free all my contents" function.
If the data structure is call "plop", then the two above functions will be called
plop *mk_copy_plop(plop *p)
and
free_plop(plop *p)
Furthermore, any function that returns a plop, and has mk_ in its title (e.g.
plop *mk_plop_from_qibble_and_squibble(quibble *q,sqibble *sq)
is guaranteed to produce a newly allocated plop, in which all subfields
of plop are also allocated (if necessary by including copies of q and sq)
so that after its creation nothing that happens to q affects plop, nothing
that happens to plop affects q, etc.
Any plop that you create by calling a mk_ function, you must also eventually free with free_plop(p).
Quick Guide to useful ADTs already around.
These are described in the UTILS and DRAW documents.
DYM : Matrix (Dynamic in the sense of always being allocated dynamically; keeps track of its own size and validity). Dynamically alterable size and shape etc.
DYV : Vector (Dynamic too. Whee!)
IVEC : Vector of integers (Dynamic)
STRING_ARRAY : Dynamic array of strings. Surprisingly useful despite their stupidly long name. Often take on the role that simple Lisp lists take on in Common Lisp.
STRING_MATRIX : A matrix of strings!
APICT : Represents a simple line/dot/circle/text pictures. Renderable on X-windows, Windows, Postscript.
And then there are lots of specialized ADTs for machine learning, dynamic programming and combinatorial optimization.
The following are our fine point programming style. We won't be upset if you elect a different style.
CapiTaLiZAtion, braces, indenting.
Many of us use lower-case for almost everything: variable names, structure names, function names, typedef names. Two exceptions: we capitalize macros. And global variables (not that there are many, of course) have a capitalized first letter.
int An_evil_global = 7;
#define AARDVARK 15.0
typedef struct coordinate
{
double x;
double y;
} coordinate;
double coordinate_magnitude(coordinate *coo)
{
double r = sqrt(coo->x * coo->x + coo->y * coo->y);
if ( An_evil_global > 22 )
{
printf("Yippee! A wild evil global value!\n");
An_evil_global = (int) AARDVARK;
}
return(r);
}
Note a couple of points of very personal style above: most of us line our open and close curly braces above each other. A deep personal decision, only makeable after hours of reflection. The programmer above did return(r) instead of return r . Why? Even they don't know. Feel free to do differently.
Some people would have declared the function as
double
coordinate_magnitude(coordinate *coo)
{
..that's fine.
We indent by two spaces. We usually try to avoid leaving tabs around in our code, but Visual C++ is very evil that way.
Here's an example header file. Note how we use the common C idiom of preventing multiple header loading by means of an #ifdef. (This is a simplified version of xdamut/heap.h).
/*
File: heap.h
Author: Andrew W. Moore
Created: 14th February 1992
Description: The heap for A-Star search.
*/
#ifndef HEAP_H
#define HEAP_H
#include "ambs.h"
#include "amma.h"
typedef struct pqueue
{
int size;
heap *heap;
pqelt **pqelts;
} pqueue;
pqueue *mk_pqueue( int max_input );
void free_pqueue(pqueue *pq);
bool pq_is_empty ( pqueue *pq );
int pq_pop_best ( pqueue *pq );
void pq_insert ( pqueue *pq, int i, double pri );
bool pq_contains ( pqueue *pq, int input );
#endif /* #ifndef HEAP_H */
If you #include any auton headers, or any headers which #include auton headers, you'll implicitly #include "standard.h" from UTILS. That file defines some project-wide constants, and especially some constants that depend upon which platform we are running. Precisely one of the following four constants will be defined:
UNIX_TTY_PLATFORM -- this code will run under Unix and not produce
any graphics. Any graphics calls will be
ignored, or possibly produce a printed warning
UNIX_XW_PLATFORM -- this code will run under Unix and is free to
produce graphics using damut/amgr.h graphics
routines
PC_TTY_PLATFORM -- this code will run on PCs (compiled by Visual C++)
and must not produce any graphics. Any
graphics calls will be ignored, or possibly
produce a printed warning, or possibly pop up
graphics windows, though all control will remain
with the standard input.
PC_MVIS_PLATFORM -- this code will run on PCs, compiled by Visual
C++ as part of a single-document project. It
requires the use of Mary's EYE files. There
is no stdio/stderr output or input. User
communication uses expos, apicts, and aform
interface functions. The programmer is free
to produce graphics using damut/amgr.h graphics
routines.
We almost never use the float type. double is preferred. Only use float if you really really feel the need to possibly save memory. Even if you give us a good story about why you're using floats, we'll frown.
In ambs.h (in UTILS) we declare the bool type, and #define TRUE and FALSE. Contrived example:
bool exclusive_or(bool a,bool b)
{
bool result;
if ( (a && b) || (!a && !b) )
result = FALSE;
else
result = TRUE;
return(result);
}
(Note, by the way, a common thing you'll see. At least one of us doesn't like to call return from the middle of a function because it has undertones of "spaghetti code". So this person usually creates a variable called result which is returned at the end. Feel very very free to ignore that piece of over-purity!).