Carnegie Mellon

Computer Science Department
15-410 Project 0: Traceback
Project Overview
In this project you will be writing a "library" which contains a
single function called traceback(). traceback()
prints out a stack trace of the program it is called from. The stack trace will
include all of the function calls made to reach the current location in
the program. You will be provided with information about all of the
functions available in the program and their arguments.
One example of a possible use for such a function would be to call it
from a segmentation fault handler to help debug the program.
Traceback Details
The prototype for traceback, as defined in traceback.h, is
void traceback(FILE *);
The argument to traceback is the file stream to which the stack trace
should be printed. For most programs, this will probably be stderr,
but taking it as an argument allows for greater flexibility in the use of
traceback.
Also defined in traceback.h is a table of all the functions in the
program. Each entry in the function table has the type functsym_t,
which contains the name of the function and the address at which the function
begins, along with a list of arguments.
Each argument is defined as an argsym_t containing the
argument type and name of the argument. The type is stored as an integer
and can be matched with the definitions in traceback.h.
For the sake of simplicity,
we are requiring you to recognize only char, int, float, double, char*,
and char**.
If the function list contains fewer than MAX_NUM_FUNCTIONS entries,
it will be terminated by a function with a zero-length name.
Similarly, if the argument list
for a function contains fewer than MAX_NUM_ARGS arguments,
it will be terminated by an argument with a zero-length name. The
functions in the list are sorted by address.
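The termination and ordering rules above can be sketched as follows. This is only an illustration: the struct layouts, the field names (addr, name, args, type), NAME_LEN, and the small limits are hypothetical stand-ins, and the real definitions in traceback.h may differ. The helper names count_args and find_function are likewise invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits and layouts for illustration only;
 * the real definitions are in traceback.h and may differ. */
#define MAX_NUM_FUNCTIONS 4
#define MAX_NUM_ARGS      2
#define NAME_LEN          16

typedef struct {
    int  type;                  /* integer type code */
    char name[NAME_LEN];        /* "" marks the end of a short list */
} argsym_t;

typedef struct {
    void    *addr;              /* address at which the function begins */
    char     name[NAME_LEN];    /* "" marks the end of a short table */
    argsym_t args[MAX_NUM_ARGS];
} functsym_t;

/* Number of valid arguments: stop at MAX_NUM_ARGS or at a zero-length name. */
size_t count_args(const functsym_t *f)
{
    size_t n = 0;
    assert(f != NULL);
    while (n < MAX_NUM_ARGS && f->args[n].name[0] != '\0')
        n++;
    return n;
}

/* Because entries are sorted by address, the best candidate for the function
 * containing addr is the last entry whose start address is <= addr.
 * Returns NULL if addr lies below the first entry. */
const functsym_t *find_function(const functsym_t *table, const void *addr)
{
    const functsym_t *best = NULL;
    size_t i;
    assert(table != NULL);
    for (i = 0; i < MAX_NUM_FUNCTIONS && table[i].name[0] != '\0'; i++) {
        if ((uintptr_t)table[i].addr > (uintptr_t)addr)
            break;
        best = &table[i];
    }
    return best;
}
```

Note that an address beyond the end of the last listed function still matches the last entry in this sketch, so a real implementation would need to decide how to treat addresses that belong to no known function.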
For each function you should print the name of the function and all of
the arguments. When printing each argument you should output the name and
the actual argument whenever the type is known. This means you must print
the string in the case of a char* and all of the strings in the case of a
char**. Be warned that traceback() must not cause a program
calling it to terminate due to a segmentation fault. If the type of an
argument is not known, you need not print the value.
For those of you wondering how you can have a global table containing
a program's function names and argument types, this is not normally possible
within the C language framework.
Each test program linked against the traceback library will obtain the
code for your traceback() function and a blank function
table. After the program is built, a Perl script will decode the
object file and modify it so that the table slots are filled in
with the correct information (see the lecture notes for a diagram).
This is not really the correct way to obtain this
information; one should obtain it at runtime by having a long and complicated
conversation with a large confusing library which understands how to parse
executable files.
The correct approach,
however, is significantly more work than intended for this project and does
not really add to the learning experience as it is just an exercise in
jumping through hoops.
Formatting
traceback() should output the functions in order from the last
(most recent) function
called to the first function called. It should contain the names and values
of all of the arguments (and void if there are no arguments).
The output of traceback() should match the following sample partial
output:
Function foo(int i=5, float f=35.000000), in
Function foobar(char c='k', char *str="test", char *unprintable=0xffff0000), in
Function bar(void), in
This indicates that some function (not shown) called bar() with no
arguments. bar() then called foobar() with a character 'k', a string "test",
and a string called unprintable, located at 0xffff0000 in memory, which
traceback() was unable to print.
foobar() in turn called foo() with the arguments 5 and 35,
and foo() invoked traceback().
All arguments are printed as "type name=value", but the following special
rules should also be applied:
- chars should be printed between single quotes
- integers and floating-point numbers should be printed in base 10
- strings should be printed between double quotes
- string arrays are displayed in the format
{"string1","string2","string3"}. The quotation marks are to be added around
each string by the output function; they are not part of the string. If
a string in the array is not printable, the address of that string
should be printed in its place.
- If there are more than 3 strings in an array, only the first 3
should be printed, followed by a "..."
(e.g., {"string1", "string2", "string3", ...}).
- If a string has more than 25 characters, only the first 25 should
be printed, followed by a "..." (e.g., "this string has more than 25 characters" should be printed as "this string has more than"...).
- anything that cannot have its value printed for any reason should
have its address printed in hex. If part of a string is printable and
part is not, then the entire string is considered to be unprintable.
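The string rules above can be sketched as a pair of small helpers. The names all_printable and format_string and the constant MAX_STR_CHARS are invented for this illustration. The sketch assumes that "printable" means isprint() accepts every character, and that the memory behind the string has already been verified as readable (strlen() on an unchecked pointer would not be safe inside a real traceback()).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_STR_CHARS 25    /* longest string prefix printed in full */

/* Returns 1 if every character of s is printable according to isprint().
 * Assumes the memory behind s has already been checked for readability. */
int all_printable(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (!isprint((unsigned char)*s))
            return 0;
    }
    return 1;
}

/* Writes s between double quotes into out, truncating after MAX_STR_CHARS
 * characters and appending "..." outside the closing quote. */
int format_string(char *out, size_t outsz, const char *s)
{
    assert(out != NULL && s != NULL);
    if (strlen(s) > MAX_STR_CHARS)
        return snprintf(out, outsz, "\"%.*s\"...", MAX_STR_CHARS, s);
    return snprintf(out, outsz, "\"%s\"", s);
}
```

Formatting into a buffer rather than printing directly keeps the helper easy to test; the result can then be written to the FILE * passed to traceback().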
Goals
Despite the fact that this is the smallest project of the five that will
be assigned in this class, it is important to pay attention to the key
concepts in Project 0. The ideas taught here will provide the foundation
for the next four projects. In particular, we would like you to be
comfortable with:
- Writing clean code in C. Many people like the C programming language
because it gives the programmer a lot of freedom (pointers, casting,
etc). It is very easy to hang yourself with this rope. Developing (and
sticking with) a consistent system of variable definitions,
commenting, and separation of functionality is essential.
People have asked about using C++ in this class. This is probably
much harder than you think, since you would need to begin by
implementing your own thread-safe (or, at least, interrupt-aware)
versions of new and delete. In addition, you would
probably find yourself implementing other pieces of C++ runtime code;
this could turn into quite a hobby. As a result, you should do
this program in C as a way of re-familiarizing yourself with the
language you'll be using for the remainder of the course.
- Writing pseudocode. For systems programming, it is very important to
think out crucial data structures and algorithms ahead of time since
they become important primitives for the rest of the system.
- Commenting. Though you will not be working with a partner for the
first two projects, you will be on all subsequent projects. It is
important to include comments so someone else looking at or
maintaining your code can quickly understand what your code is doing
without having to look at its internals. For this assignment,
which is a refresher, it should not be hard to comment your code
appropriately, and you may do so in the standard fashion. However,
since the remaining assignments will use it, we will
describe the doxygen system, which is similar to
javadoc but for C.
- Using common development tools (gcc, ld).
- Communicating with the TAs using various channels of communication
(zephyr, bulletin board, Q&A archive, course web page, office hours).
Getting Started
You will probably find yourself wishing for some information
which is not portably available within the C language framework,
so you will need to write a scrap or two of x86 assembly language.
We suggest you do this by writing a C-callable function in a .S
file (note that the 'S' is upper-case) rather than using the
asm() in-line assembly language facility.
Either one will work, but in practice it is very easy to write
code with asm() which works with one version of your
program or a particular version of your compiler but which
breaks mysteriously later. In addition, littering your
C code with asm() calls makes it extremely painful
to port the code from one hardware platform to another.
The support code includes a sample .S file,
and you can find asm() covered in the
"Assembler Instructions with C Expression Operands"
section of the gcc documentation.
If, despite our advice, you decide to use asm(),
keep in mind that for correctness you must
use the "complicated" version which correctly communicates
your intent to the compiler.
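Once a small assembly routine has handed the current frame pointer back to C, the rest of the work can stay in C. The sketch below assumes the conventional x86 frame layout, in which each frame pointer points at the caller's saved frame pointer followed immediately by the return address; that layout is an assumption about how the test programs are compiled, not something this handout guarantees. The type frame_t and the function collect_return_addrs are hypothetical names, and the chain here is walked until a NULL link purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed x86 frame record: saved frame pointer, then return address. */
typedef struct frame {
    struct frame *prev;     /* caller's saved frame pointer */
    void         *ret_addr; /* address in the caller to return to */
} frame_t;

/* Collects up to max return addresses, innermost first, starting at fp.
 * Stops at a NULL link; returns the number of addresses stored. */
size_t collect_return_addrs(const frame_t *fp, void **out, size_t max)
{
    size_t n = 0;
    assert(out != NULL);
    while (fp != NULL && n < max) {
        out[n++] = fp->ret_addr;
        fp = fp->prev;
    }
    return n;
}
```

A real stack is not guaranteed to end in a tidy NULL link, so each frame pointer would need to be validated before it is dereferenced, and the walk needs some other stopping condition as well.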
Important Dates
- Wednesday, September 1st: Project 0 assigned.
- Wednesday, September 8th: Project 0 is due at 11:59pm.
Testing
It is important that your traceback() function be able to
deal with any sort of program in which someone might wish to use it. You
must ensure that it will work properly regardless of where it is called within
any program, and that traceback() does not damage the correct
operation of the program after it returns.
Take some time to develop the evilest, harshest cases that you can,
because while grading we will subject your code to the most diabolical
tests we can imagine. Of course, if your code is well written, it should
have no problems passing these tests.
Also, for your convenience, we will provide an output verification script
which will ensure that your output format matches our script's expectations. We will make a post to the bboard when it is available.
Documenting
Commenting is an important part of writing code. If you wish,
you may get a jump on future assignments by using doxygen; see our
doxygen documentation to see how to
include comments in your code that can be read by doxygen.
When we grade
your projects, we will begin with your documentation.
Lack of
documentation will be reflected in your grade.
The provided
traceback_internal.h
file contains example doxygen comments with the sort of information we are
expecting to see. Although we put the doxygen comments for our functions
in the .h file, you should typically put yours in the .c file, with each
function's comment block adjacent to the code.
In
addition, we have provided a rule in the Makefile to take care of
generating the documents for you. The rule is make html_doc;
if you have set it up to work, we will run it as part of grading.
Other Important Notes
Since we will be running and testing your code on Andrew Linux
machines, your code will be compiled, linked, and run under gcc 3.2.1.
If you are working on standard cluster machines, then you don't have to
worry about anything. If you are working on a non-cluster personal machine,
you can check the
version of gcc you are using by running gcc --version on the
command line. If your version is not 3.2.1, you must make sure that
your code compiles, links, and runs fine under 3.2.1.
Please do not change any of the provided files except for traceback.c
and traceback.mk. Modifying traceback.mk should allow you to make any
changes necessary for compiling the traceback library and any test
programs. We will run your code using our versions of the files, so
any changes you make to other files will be overwritten.
As compiling many different tests can take a noticeable amount of time,
we just wanted to mention that the Makefile allows you to build
a subset of your tests.
Typing make foobar will compile the
foobar test (after updating the traceback library if necessary).
While you probably do not need to use any 410-built programs
for this assignment, you will probably want to set things up
so that /afs/cs.cmu.edu/academic/class/15410-f04/bin is on your
$PATH. For your convenience, you may wish to make
an easy-to-type symbolic link to the root of the course AFS
volume, e.g.,
% ln -s /afs/cs.cmu.edu/academic/class/15410-f04 $HOME/410
Your AFS volumes have not been
created yet, and the
class bulletin boards still contain posts from the previous
semester. We know about these issues and the relevant
parties are working on them. Luckily, they should not
impede your work as you begin this project.
Hand-in Instructions
You will be required to hand in all your .c, .S, .h, and any other
files necessary to run your code. Minimally this will include the
traceback function and any support functions that it requires. When
we run your code, it should display the behavior described in the
Traceback Details section above.
See http://www.cs.cmu.edu/~410-f04/p0/handinP0.html
for details.
evil_test Hints
You may be wondering how your program can determine whether
a given address is valid (i.e., backed by memory) at run-time.
Like many other questions which will arise as this course unfolds,
there are multiple approaches, with different tradeoffs. In general you
should strive to identify two to three approaches, choose among them
based on weighing a variety of criteria, and briefly document
the thinking behind your choice.
But since Project 0 is a warm-up, it seems appropriate to
give a few hints.
- A segmentation fault need not necessarily kill your program.
Recall from 15-213 what causes a segmentation fault, how a typical
Unix kernel reacts, and what control you have over that sequence
of events.
- If you carefully study the documentation for various system calls,
such as msync() and write(), you may find a way to
(ab)use one of them to your benefit.
- The documentation for the proc pseudo-file-system may
be of use to you.
For this assignment it is more important that whichever approach
you choose be carried out well (completely and
cleanly) than that you pick the alternative which is our
favorite.
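As one illustration of the system-call hint, the sketch below asks the kernel to copy a single byte from the address in question by calling write() on a pipe; if the address is not backed by readable memory, write() fails with EFAULT instead of the program taking a segmentation fault. The function name byte_is_readable is invented here, and the choice of a pipe is an assumption made for this sketch: some devices, such as /dev/null on Linux, may report success without ever touching the buffer, which would defeat the probe.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Returns 1 if one byte at addr is readable, 0 if the kernel rejects the
 * address with EFAULT, and -1 on any other error. rfd and wfd are the read
 * and write ends of a pipe created by the caller with pipe(). */
int byte_is_readable(int rfd, int wfd, const void *addr)
{
    char sink;
    ssize_t n;
    assert(rfd >= 0 && wfd >= 0);
    n = write(wfd, addr, 1);    /* the kernel copies from addr on our behalf */
    if (n == 1) {
        /* Drain the byte so repeated probes never fill the pipe. */
        return (read(rfd, &sink, 1) == 1) ? 1 : -1;
    }
    return (n == -1 && errno == EFAULT) ? 0 : -1;
}
```

This probes only one byte per call, so checking a whole string means probing as the string is walked, and the cost of two system calls per byte is one of the tradeoffs worth weighing against the other hinted approaches.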