Carnegie Mellon
Computer Science Department

15-410 Project 4: PebTrace Debugging


Overview and Motivation

So far this semester you have focused on how threads operate more or less normally within the privacy of their respective address spaces. Meanwhile, in previous courses you have used tools such as gdb and strace, which are programs that inspect and/or modify the behavior of other threads. In this semester's Project 4 you will extend your kernel with a facility, "PebTrace", which allows one Pebbles program to monitor, inspect, and modify the execution of other Pebbles programs.

Thread Tracing

The basic idea is this: while one thread is monitoring or debugging another thread, the target thread will at various times report its execution state to the thread overseeing its execution, which will inspect the reported state, possibly make changes, and then resume the execution of the target thread. The execution state of an individual thread will be communicated back and forth in terms of an augmented version of the ureg_t structure that you are already familiar with; in addition, a system call allows inspection and modification of the target thread's address space. By carefully combining various PebTrace features, it is possible for a trace/debugger program, even if it is single-threaded, to monitor and control the execution of a target program, even if it is multi-threaded.

Fundamentally, a traced thread reports its state when it encounters an exception or executes a system call. Because the thread controlling a traced thread can completely rewrite the traced thread's state, and can even cancel a system call before it executes, execution can be modified in essentially arbitrary ways.

The new architecture can be described in terms of the following feature list.

  1. Each thread may optionally be traced by at most one other thread. A tracing thread will often be referred to as the "tracer" and a traced thread will often be referred to as the "tracee". Each tracer thread may have multiple tracees, and may also be traced itself. (The graph formed by these tracing relationships may have cycles, but creating one is unlikely to be productive.)

  2. A traced thread may be in one of three "trace states": Normal, Notifying, and Waiting. The "Normal" state corresponds to typical forward execution of instructions.

  3. If a thread is being traced, when any "trace event" (defined below) occurs, the tracee will pause, enter the Notifying state, and notify its tracer of the event, including its current state (encoded in a ureg_t struct as described below).

  4. Once the tracer has received the notification via the pebtrace_wait() system call, the tracee enters the Waiting state, where it waits for a new state from the tracer, and can have its memory modified by the pebtrace_mem() system call. Once the tracer has restarted the tracee with the pebtrace_continue() system call, the tracee reenters the Normal state and resumes execution. The tracer may have modified its execution state before resuming it, as described below.

  5. If a thread which has a tracer succeeds in creating a new thread through a call to fork() or thread_fork, the new thread will begin execution being traced by the same tracer thread (tracing is "inherited" across fork() and thread_fork).

  6. If a tracing thread exits, all threads it is tracing are "detached" (see below).
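The trace-state transitions in the list above can be summarized as a small state machine. The following is a minimal sketch, not a required design: the names trace_state_t, trace_input_t, and trace_next_state are hypothetical, and the exit and detach paths are deliberately left out.

```c
#include <assert.h>

/* Hypothetical names; the handout names the states but not these identifiers. */
typedef enum { TRACE_NORMAL, TRACE_NOTIFYING, TRACE_WAITING } trace_state_t;

typedef enum {
    TRACE_IN_EVENT,     /* a trace event occurred in the tracee */
    TRACE_IN_COLLECTED, /* the tracer collected the event with pebtrace_wait() */
    TRACE_IN_CONTINUED  /* the tracer called pebtrace_continue() */
} trace_input_t;

/* Returns the next trace state, or -1 if the input is not legal in state cur. */
int trace_next_state(trace_state_t cur, trace_input_t in)
{
    assert(cur == TRACE_NORMAL || cur == TRACE_NOTIFYING || cur == TRACE_WAITING);
    switch (cur) {
    case TRACE_NORMAL:    return (in == TRACE_IN_EVENT)     ? TRACE_NOTIFYING : -1;
    case TRACE_NOTIFYING: return (in == TRACE_IN_COLLECTED) ? TRACE_WAITING   : -1;
    case TRACE_WAITING:   return (in == TRACE_IN_CONTINUED) ? TRACE_NORMAL    : -1;
    }
    return -1;
}
```

A table like this makes it easy to reject, for example, a pebtrace_continue() aimed at a thread that is still in the Notifying state.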

Event Notification Format

A traced thread reports to its tracer when it is stopped (and, if necessary, attached to) by an invocation of pebtrace_stop(), when it encounters an exception, when it attempts to begin a system call, when it completes a system call, and, finally, when it vanishes. From the point of view of the tracer, execution of a system call is "atomic": once a traced thread is allowed to begin a system call, the tracing thread will not hear anything further about the traced thread until the system call is about to return to user space.

When the kernel reports the status of a traced thread to a tracing thread, it reports the traced thread's registers and also the reason why the traced thread stopped. Because the Pebbles kernel specification already contains a structure for reporting register values and causes, the ureg_t struct used for the swexn() system call has been augmented to provide information about a tracee's state to the tracer and also for a tracer to modify the state of a Waiting tracee.

When a tracer uses pebtrace_wait() to receive an event from a tracee, it receives a ureg_t filled with the tracee's state. Since PebTrace needs to report more information than swexn() (specifically, which trace condition caused the tracee to stop), the high-order bits of the cause field are dedicated to indicating this. We provide the following constants for this purpose:

// Mask to get just the parts of the cause that swexn would use
#define TRACE_SWEXN_CAUSE_MASK 0xff
// Mask to get the extra cause information pebtrace adds
#define TRACE_EXTRA_CAUSE_MASK 0xe0000000

#define TRACE_CAUSE_SHIFT 29
#define TRACE_CAUSE_EXITED         (0x1 << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_SYSCALL_ENTER  (0x2 << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_SYSCALL_EXIT   (0x3 << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_STOPPED        (0x4 << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_FAULT          (0x5 << TRACE_CAUSE_SHIFT)
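
As an illustration of how a tracer might split a cause value into its two parts, here is a minimal sketch. The helper names (trace_cause_kind, trace_cause_detail, trace_make_cause, trace_cause_is_known) are hypothetical, and the constants are local copies with a u suffix added so that the shifts are done in unsigned arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the handout's constants; the u suffixes are an addition. */
#define TRACE_SWEXN_CAUSE_MASK 0xffu
#define TRACE_EXTRA_CAUSE_MASK 0xe0000000u
#define TRACE_CAUSE_SHIFT 29
#define TRACE_CAUSE_EXITED         (0x1u << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_SYSCALL_ENTER  (0x2u << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_SYSCALL_EXIT   (0x3u << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_STOPPED        (0x4u << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_FAULT          (0x5u << TRACE_CAUSE_SHIFT)

/* Which trace condition stopped the tracee (one of TRACE_CAUSE_*). */
uint32_t trace_cause_kind(uint32_t cause)
{
    return cause & TRACE_EXTRA_CAUSE_MASK;
}

/* The swexn-style part: a fault number or a system call number. */
uint32_t trace_cause_detail(uint32_t cause)
{
    return cause & TRACE_SWEXN_CAUSE_MASK;
}

/* Combine a TRACE_CAUSE_* kind with a fault or system call number. */
uint32_t trace_make_cause(uint32_t kind, uint32_t detail)
{
    assert((kind & ~TRACE_EXTRA_CAUSE_MASK) == 0);
    assert((detail & ~TRACE_SWEXN_CAUSE_MASK) == 0);
    return kind | detail;
}

/* True if the high bits hold one of the five defined trace causes. */
int trace_cause_is_known(uint32_t cause)
{
    uint32_t code = trace_cause_kind(cause) >> TRACE_CAUSE_SHIFT;
    return code >= 1 && code <= 5;
}
```

For example, a page fault (x86 vector 14) would be reported with a cause of trace_make_cause(TRACE_CAUSE_FAULT, 14).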

Handling Events

Above we listed the trace events in "thread life-cycle order"; in this section we will describe them in greater detail and in "complexity order".

Events

The events that trigger a notification (and transition a thread from Normal to Notifying) are:

  1. Entering a system call
    • cause will be (TRACE_CAUSE_SYSCALL_ENTER | syscall_no).
    • eip should point to the instruction that triggered the trap, not the instruction after it. (In particular, your implementation should probably report an eip value which is two less than the value of eip that the hardware pushed onto the kernel stack during the mode switch.)
    • error_code and cr2 will be zero.
  2. Finishing a system call
    • cause will be (TRACE_CAUSE_SYSCALL_EXIT | syscall_no).
    • eip will point to the next (user-space) instruction to execute after the syscall returns.
    • error_code and cr2 will be zero.
  3. Taking an exception
    • cause will be (TRACE_CAUSE_FAULT | fault).
    • eip will be the address of the faulting instruction.
    • If the fault was a page fault, cr2 will be the value of cr2 set when the fault was taken, otherwise zero.
    • If the fault pushes an error code, error_code will be that code, otherwise zero.
    • Note that any "secret" page faults that are handled by the kernel should not be reported to the tracer as trace events.
  4. Invoking a swexn() handler
    • cause will be (TRACE_CAUSE_SYSCALL_EXIT | SWEXN_INT). This is not a typographical error: the tracer will observe the "completion" of a fake system call that was not previously started.
    • The ureg_t struct will contain the execution state that will be used to launch the handler. In particular, eip will be the first instruction of the handler and esp will point into the exception stack.
    • Note that before the tracer receives the handler-invocation event the kernel has pushed state onto the user-space exception stack (esp3) and de-registered the handler.
  5. Attempting to run the first userspace instruction after being told to stop by a call to pebtrace_stop(), if none of the above trace-event conditions apply
    • cause will be TRACE_CAUSE_STOPPED.
    • eip will be the address of the next user-space instruction to execute.
    • error_code and cr2 will be zero.
  6. Exiting
    • cause will be TRACE_CAUSE_EXITED.
    • All other fields will be zero.
    • Once the notification of this event is delivered, the traced thread no longer exists. The tracer is automatically detached from it, and the thread does not enter the Waiting state after the notification has been received.
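
The field rules for the first and third events above can be sketched as follows. This is only an illustration under stated assumptions: mini_ureg_t is a hypothetical subset of ureg_t containing just the fields discussed here, the helper names are invented, and the two-byte adjustment assumes the system call was made with the x86 "int imm8" instruction.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_CAUSE_SHIFT 29
#define TRACE_CAUSE_SYSCALL_ENTER (0x2u << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_FAULT         (0x5u << TRACE_CAUSE_SHIFT)

#define INT_INSN_LEN   2u   /* x86 "int imm8" is encoded as 0xCD imm8 */
#define PAGE_FAULT_VEC 14u  /* x86 page-fault vector */

/* Hypothetical subset of ureg_t holding only the fields discussed above. */
typedef struct {
    uint32_t cause, cr2, error_code, eip;
} mini_ureg_t;

/* System call entry: report the address of the trap instruction itself. */
void report_syscall_enter(mini_ureg_t *u, uint32_t syscall_no, uint32_t pushed_eip)
{
    assert(u != 0);
    u->cause = TRACE_CAUSE_SYSCALL_ENTER | syscall_no;
    u->eip = pushed_eip - INT_INSN_LEN;
    u->cr2 = 0;
    u->error_code = 0;
}

/* Exception: cr2 is reported only for page faults, and error_code only
 * when the hardware pushed one for this vector. */
void report_fault(mini_ureg_t *u, uint32_t vec, uint32_t fault_eip,
                  uint32_t hw_cr2, int has_error_code, uint32_t hw_error_code)
{
    assert(u != 0);
    u->cause = TRACE_CAUSE_FAULT | vec;
    u->eip = fault_eip;
    u->cr2 = (vec == PAGE_FAULT_VEC) ? hw_cr2 : 0;
    u->error_code = has_error_code ? hw_error_code : 0;
}
```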

Resuming Execution

Continuing a thread that is in the Waiting state should behave as follows:

  1. If cause is (TRACE_CAUSE_FAULT | fault_no) for some fault_no that is the number of a fault that can be generated by user code, then an exception is delivered to the thread. This means invoking the software exception handler if one is registered and killing the thread if one is not.
  2. Otherwise, the cause field is ignored, and how the new state is handled depends on details of the event that caused the notification:
  3. If the event was caused by a system call entry, then:
    • If the eip field of the ureg_t struct provided as a parameter to the pebtrace_continue() system call which resumes the thread still points to the address of the instruction that was reported as having triggered the trap, then the system call begins execution. Note that the system call executes using the register values specified in the ureg_t struct, not the values that were in the registers when the trap instruction was first executed.
    • If the eip field has been changed, then the system call is not executed. The thread resumes execution in user space with registers as specified by the new ureg_t struct.
  4. If the event was caused by an exception, the exception is not delivered (it has been "swallowed", see below). The thread resumes execution in userspace with registers as specified by the new state. Intuitively, the tracer has "fixed" the problem that caused the exception--perhaps by modifying memory, perhaps by modifying register values. Of course, it is possible that the resumed tracee could fault again for some other reason.
  5. If the event was caused by finishing a system call or being stopped by a call to pebtrace_stop(), then the thread resumes execution in userspace with registers as specified by the new state.
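
The rules above amount to a small decision procedure run by pebtrace_continue(). The sketch below shows one way to express it. The names resume_action_t, resume_action, and is_user_fault are hypothetical, and is_user_fault in particular is a placeholder policy: a real kernel would list exactly the faults that user code can raise.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_SWEXN_CAUSE_MASK 0xffu
#define TRACE_EXTRA_CAUSE_MASK 0xe0000000u
#define TRACE_CAUSE_SHIFT 29
#define TRACE_CAUSE_SYSCALL_ENTER (0x2u << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_FAULT         (0x5u << TRACE_CAUSE_SHIFT)

typedef enum {
    ACT_DELIVER_EXCEPTION, /* invoke the swexn handler or kill the thread */
    ACT_RUN_SYSCALL,       /* begin the system call with the new registers */
    ACT_SKIP_SYSCALL,      /* cancel the system call, resume in user space */
    ACT_RESUME_USER        /* resume in user space with the new registers */
} resume_action_t;

/* Placeholder: x86 reserves vectors 0 through 31 for exceptions. */
static int is_user_fault(uint32_t fault_no)
{
    assert(fault_no <= TRACE_SWEXN_CAUSE_MASK);
    return fault_no < 32;
}

/* event_cause and reported_eip come from the event the tracer collected;
 * new_cause and new_eip come from the ureg_t passed to pebtrace_continue(). */
resume_action_t resume_action(uint32_t event_cause, uint32_t reported_eip,
                              uint32_t new_cause, uint32_t new_eip)
{
    if ((new_cause & TRACE_EXTRA_CAUSE_MASK) == TRACE_CAUSE_FAULT &&
        is_user_fault(new_cause & TRACE_SWEXN_CAUSE_MASK)) {
        return ACT_DELIVER_EXCEPTION;
    }
    if ((event_cause & TRACE_EXTRA_CAUSE_MASK) == TRACE_CAUSE_SYSCALL_ENTER) {
        return (new_eip == reported_eip) ? ACT_RUN_SYSCALL : ACT_SKIP_SYSCALL;
    }
    /* Swallowed exception, system call exit, or pebtrace_stop() event. */
    return ACT_RESUME_USER;
}
```

Note that the first check looks only at the new cause, so a tracer can inject an exception into a thread that stopped for some other reason.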

A few cases are a little subtle:

  • When the kernel reports the invocation of a software exception handler, it reports it as though a fictitious swexn() system call is returning.
  • Before execution begins of a thread newly created with fork() or thread_fork, the kernel generates an event which appears to be the return from a fork() or thread_fork system call that was actually made by the parent thread. Note that this means that a tracer is likely to see one fork() or thread_fork call return twice (with different tids).
  • Before execution begins of a thread starting at the entry point of a program after a call to exec(), the kernel generates an event which appears to be the return from an exec() call which was actually made by the thread, but when it was running a different program in the previous address space.

The flow of control for exceptions is a little complicated:

  1. First the kernel delivers a "thread ran into an exception" event.
  2. The tracer can choose to deliver the exception (in general the ureg_t struct in the event should be suitable for delivering the exception) or to swallow it (by specifying a ureg_t struct lacking TRACE_CAUSE_FAULT in the cause field).
  3. If the exception is delivered, the tracer can expect that the tracee will "quickly" stop again, due to either TRACE_CAUSE_EXITED (if no handler was registered or it couldn't be invoked) or else (TRACE_CAUSE_SYSCALL_EXIT | SWEXN_INT) (if the tracee is ready to begin the handler).
  4. The tracer can use this as an opportunity to learn the address of the handler and the exception stack and potentially to inspect the tracee's memory.
  5. At this point the tracer can either continue the tracee to invoke the handler (in general the ureg_t struct will be appropriate) or can modify state in such a way that the handler will not be invoked and then continue the tracee.
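
On the tracer side, steps 2 and 3 can be sketched with two small helpers. Both names (swallow_exception, expected_after_delivery) are hypothetical, and the value of SWEXN_INT is passed in as a parameter because it comes from the Pebbles headers, which are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_EXTRA_CAUSE_MASK 0xe0000000u
#define TRACE_CAUSE_SHIFT 29
#define TRACE_CAUSE_EXITED        (0x1u << TRACE_CAUSE_SHIFT)
#define TRACE_CAUSE_SYSCALL_EXIT  (0x3u << TRACE_CAUSE_SHIFT)

/* Step 2, swallow: strip the trace bits so that the cause handed to
 * pebtrace_continue() no longer contains TRACE_CAUSE_FAULT. */
void swallow_exception(uint32_t *cause)
{
    assert(cause != 0);
    *cause &= ~TRACE_EXTRA_CAUSE_MASK;
}

/* Step 3: after delivering an exception, the next event from that tracee
 * should be either an exit or the fake swexn() "return". */
int expected_after_delivery(uint32_t next_cause, uint32_t swexn_int)
{
    if ((next_cause & TRACE_EXTRA_CAUSE_MASK) == TRACE_CAUSE_EXITED) {
        return 1;
    }
    return next_cause == (TRACE_CAUSE_SYSCALL_EXIT | swexn_int);
}
```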

System Call Specification

The following are defined by syscall.h.

int pebtrace_stop(int tid);

Instructs the thread with thread id tid to suspend execution. If that thread is not currently traced by the calling thread, it becomes traced by the calling thread. After this call succeeds, the target thread should not execute any more user space instructions until continued or detached.

Note that pebtrace_stop() may return before the target thread has entered the Notifying state.

Fails if the target is already traced by another thread, is the calling thread, or belongs to the init process. Returns zero on success, an integer error code less than zero on failure.

int pebtrace_detach(int tid);

Detaches from a tracee thread. That thread becomes untraced and resumes running normally.

Returns an integer error code less than zero if tid is not the thread id of a thread currently being traced by the calling thread. Returns zero otherwise.

int pebtrace_wait(ureg_t *state);

Collects an "event" from one of the traced threads as discussed above. If none of the caller's traced threads have uncollected events (that is, none of them are in the Notifying state), block until an event occurs. The state of the tracee and information about the event will then be filled into state and the return code will be the thread id of the traced thread.

If the caller is not tracing any threads, pebtrace_wait() will return an integer error code less than zero. If state does not refer to writable memory, pebtrace_wait() will return an integer error code less than zero instead of collecting an event.

int pebtrace_continue(int tid, ureg_t *state);

Continues the execution of the thread with thread id tid, which must be traced by the calling thread and in the Waiting state. It will be resumed with the state described by state, as discussed above.

If tid is not the thread id of a traced thread in the Waiting state, or state does not describe a valid state for a thread, then an integer error code less than zero is returned. Zero is returned on success.

typedef enum { PEBTRACE_WRITE, PEBTRACE_READ } pebtrace_mem_mode;
int pebtrace_mem(int tid, pebtrace_mem_mode mode,
                 unsigned int tracee_addr, void *tracer_addr,
                 unsigned int size);

Reads or writes a range of memory in the address space of the traced thread whose thread id is tid. The tracee must be in the Waiting state. If mode is PEBTRACE_WRITE, copy size bytes from tracer_addr in the current address space to tracee_addr in the traced thread's address space. If mode is PEBTRACE_READ, copy size bytes from tracee_addr in the tracee's address space to tracer_addr in the current address space.

In order to support inserting breakpoint instructions, PEBTRACE_WRITE must be able to write to read-only memory in the traced thread's address space. The kernel may impose the restriction that the entire memory range in the traced thread's address space must lie on a single page. Also, depending on the implementation of your kernel, it is permissible for a PEBTRACE_WRITE operation to fail in low-memory conditions.

If tid is not the thread id of a traced thread in the Waiting state, or the memory regions specified are invalid, then an integer error code less than zero is returned. Returns zero otherwise.
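
If a kernel chooses to impose the single-page restriction mentioned above, the check might look like the following sketch. The function name range_on_single_page is hypothetical, a 4 KB page size is assumed, and accepting a zero-length range is an arbitrary choice that the specification does not address.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u  /* assumed x86 page size */

/* Returns 1 if [addr, addr + size) lies within one page, 0 otherwise. */
int range_on_single_page(uint32_t addr, uint32_t size)
{
    uint32_t last;

    if (size == 0) {
        return 1;            /* assumption: an empty range is acceptable */
    }
    last = addr + size - 1;
    if (last < addr) {
        return 0;            /* range wraps around the top of memory */
    }
    assert(last >= addr);
    return (addr / PAGE_SIZE) == (last / PAGE_SIZE);
}
```

Under this restriction, a tracer that wants to copy a larger buffer would issue one pebtrace_mem() call per page.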

The kernel MUST ensure that it does not allow a thread to assume register values which are unsafe in the sense of allowing the thread to crash the kernel (there is no requirement, however, that the kernel protect a thread from assuming register values which will cause the thread to "crash").
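
As a starting point for thinking about what "unsafe" means, here is a minimal sketch of a register check. It is not a complete policy. The struct mini_regs_t and the function names are hypothetical, the user-mode selector values are passed in as parameters because they depend on your kernel's segment setup, and the EFLAGS bit positions are those defined by the x86 architecture.

```c
#include <assert.h>
#include <stdint.h>

/* x86 EFLAGS bits relevant to kernel safety. */
#define EFL_IF   0x00000200u  /* interrupt enable */
#define EFL_IOPL 0x00003000u  /* I/O privilege level (two bits) */
#define EFL_NT   0x00004000u  /* nested task */
#define EFL_VM   0x00020000u  /* virtual-8086 mode */

/* Hypothetical subset of ureg_t holding the fields checked here. */
typedef struct {
    uint32_t cs, ss, ds, eflags;
} mini_regs_t;

/* A user thread must run with interrupts enabled, IOPL 0, and neither
 * nested-task nor virtual-8086 mode. */
int eflags_is_safe(uint32_t eflags)
{
    if (!(eflags & EFL_IF)) {
        return 0;
    }
    return (eflags & (EFL_IOPL | EFL_NT | EFL_VM)) == 0;
}

/* user_cs and user_ds are the kernel's user-mode selectors (assumed
 * to have requested privilege level 3). */
int regs_are_safe(const mini_regs_t *r, uint32_t user_cs, uint32_t user_ds)
{
    assert(r != 0);
    return r->cs == user_cs && r->ss == user_ds && r->ds == user_ds &&
           eflags_is_safe(r->eflags);
}
```

A real implementation would likely also check the remaining segment registers and any other state the kernel restores on the way back to user space.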

Deliverables

  • The system call entry and exit code and the exception handling code in your kernel will need to be modified.

  • You will probably need to add support for the breakpoint/INT3 exception (IDT entry 3).

  • The kernel must implement the full set of pebtrace system calls.

Don't forget to make veryclean when you submit your code. Thanks!

Some Test Code

The following test programs have been provided for your enjoyment.

strace
Traces the execution of a specified process.
pdb
A Pebbles debugger.
inject
Injects a new thread into an already running task. This is not a particularly principled thing to do.

Getting Started

  1. Begin with a copy of your p3 directory tree. Read the directions contained in the P4 tar file and distribute the provided files accordingly.
  2. Do an update and marvel at the new files which arrive.
  3. Read through this document in its entirety.
  4. Begin designing. ...

Have Fun!

Make sure to have some fun with this project. You've earned it, right?


[Last modified Monday November 19, 2012]