Notes on threads: ;;;; Lisp level functional interface. Individual threads are represented by a thread structure. The functions for creating/manipulating threads are: make-thread function &key stack-size initially-suspended name plist => thread Creates a new thread that will initially call FUNCTION. thread-self => thread Return the thread object for the currently running thread. thread-yield Inform the scheduler that now would be a good time to schedule someone else. thread-exit value-form [macro] Terminate the current thread. Acts as if the initial function return returned with VALUE. Unwind-protects *are* processed. All the values of value-form are returned, just like THROW. thread-result thread => t Returns the value(s) returned by the function supplied to MAKE-THREAD or the value(s) supplied to THREAD-EXIT. If the thread is still running, this blocks until it isn't. thread-name thread => t Allows the user to associate a name to a thread. The name is not used by anything except the thread print function. thread-plist thread => list Allows the user to associate thread specific info to a thread. thread-suspend thread => void thread-resume thread => void Suspends (or resumes) the supplied thread. ### Do we want to export this functionality? Might be useful for trying to make advanced debugger interfaces to threads. thread-kill thread &optional process-unwind-protects => void Zot the thread. thread-status(thread) => (member :running :blocked :waiting :suspended :killed :exited) Returns the current status of thread: :running - self explanatory :blocked - blocked trying to lock a mutex. :waiting - waiting on a condition object. :suspended - hit with thread-suspend :killed - hit with thread-kill :exited - either called thread-exit or the initial function returned. do-threads((var &optional result-form) &body body) Iterate over all the threads in the system. sleep(seconds) Suspend this thread for SECONDS seconds. It is guarenteed that this function will not return until at least SECONDS seconds, but maybe more. Other threads may run during this time. Note: cthreads has a function ``thread-detach'' that tells it not to bother saving the return value for a thread. We obviously don't need that, because that is what the garbage collector is for. Internal routines used by mutexes and events. thread-sleep new-status Puts the current thread to sleep. Someone else better be prepared to call thread-wakeup on this thread. While sleeping, change the thread status to new-status. thread-wakeup thread Wakeup the supplied thread. Mutual exclusion is supported by mutex structures, which is a binary semaphore. These are just like cthreads mutexes. make-mutex &key name initially-locked Create a new mutext. mutex-name mutex Return (or set with setf) the name associated with this mutex. Only used by the print-function. mutex-lock mutex Attempt to lock the mutex, and block until you can. Note: no attempt is made to protect against the case of a single thread attempting to lock a single mutex twice (which will deadlock it). [In fact, that is a useful operation if you know someone else will be unlocking the mutex later.] mutex-unlock mutex Unlock the mutex. Note: no attempt is made to assure that the unlocked actually held the lock. mutex-try-lock mutex Just like mutex-lock, but don't block. Returns T if the lock was acquired, and NIL if not. mutex-locked mutex Return T if the mutex is currently locked, and NIL if not. with-mutex (mutex) &body body [macro] Lock the mutex, execute the code in body, and then unlock the mutex. The unlock is in an unwind-protect. Event structures can be used to synchronize multiple threads. These are just like the cthreads conditions, but name is different to keep from conflicting with Common Lisp conditions. make-event &key name Make a new event object. event-name event Return (or set) the associated name. Again, only used by the print function. event-wait event mutex Wait for some event to arise. The mutex is unlocked and the current thread is put in a :waiting state (this happens atomically). When the event is signaled, attempt to re-acquire the lock on the mutex and return. As some other thread might have beaten this thread to the lock, the condition should be verified again. For example: (with-mutex (*mutex*) (loop (if (return) (event-wait *mutex* *event*))) ) on-event (mutex event-var) condition-form &body body [macro] Syntactic sugar for the standard way to use events. The above example becomes: (on-event (*mutex* *event*) ) event-signal event Signal that EVENT has happened. This wakes up only one of the thread that are waiting on the event. event-broadcast event Same as event-signal, but wake up all threads waiting on the event. Other models of parallelism and synchronization can easily be built on top of these. This set of primitives has been selected because other threads packages seem to think that they are enough and that they all can be efficiently implemented. ;;;; Thread groups and interaction sessions Each thread can be part a member of some thread-group. When a thread is created, its thread-group defaults to that of the creating thread. The thread-group of a thread cannot be changed after creation. Each thread-group can be executing in the foreground of some interaction-session or in the background. When a thread-group is created, it is in the background. Later, it can be brought to the foreground of some interaction session. If a thread ever tries to invoke the debugger, the thread is suspended. If the thread is part of some thread-group, all other the threads in that thread-group are suspended. If the thread-group is executing in the foreground, If a thread ever tries to invoke the debugger, it and any other threads in it's thread group are suspended. Additionally, if the thread is a member of some thread-group, the interaction session for that thread group is notified. thread-thread-group(thread) Return the thread-group for the supplied thread. make-thread-group(&key name plist internal) thread-group-name(thread-group) thread-group-plist(thread-group) thread-group-threads(thread-group) thread-group-suspend(thread-group) thread-group-resume(thread-group) thread-group-kill(thread-group) thread-group-interaction-session(thread-group) Return the interaction-session that this thread group was most recently executing in the foreground of. If this thread group was never in the foreground, return NIL. make-interaction-session(signal-handler ???) Create a new interaction-session. terminate-interaction-session(session) The supplied interaction-session is terminated. If a thread-group is executing in the foreground of this session, it is moved to the background first. interaction-session-foregound-thread-group(session) Return the currently active thread group for the supplied session. Can be NIL if no thread-group is in the foreground of this session. Setting this to NIL will return the thread-group to the background, and setting it to a thread-group will put that thread group in foreground (after backgrounding the current foreground thread group if necessary). interaction-session-signal-handler(session) The function used by the supplied session to raise conditions. The thread the function is executed in is not necessarly related to the interaction session. signal-interaction-session(session datum &rest arguments) Raise some condition in some session. The datum and arguments are coerced into a condition object which is passed to signal-handler for the supplied interaction session. ;;;; Debugger/Top-level Given that we will want both the debugger and the top level REP to be able to manipulate threads/thread-groups, it seems reasonable that they be unified. Along these lines, the top-level REP is just the debugger without any thread current. :jobs - list the current thread group :job - specify which job to consider the current thread group :foreground, :fg - resume executing the current thread group in the foreground :background, :bg - resume executing the current thread group in the background :threads - list the threads in the current thread group :all-threads - list all threads :thread - specify which thread to consider current ;;;; Terminal IO system:*tty* system:*stdin* system:*stdout* system:*stderr* *terminal-io* synonym of system:*tty* *standard-input* synonym of system:*stdin* *standard-output* synonym of system:*stdout* *error-output* synonym of system:*stderr* *trace-output* same as *error-output* *query-io* synonym of *terminal-io* *debug-io* synonym of *terminal-io* If the lisp is fired up with the ``-batch'' switch, then the lisp is considered to be in batch mode. In this case, *terminal-io* is not bound to /dev/tty, and nothing special is done to multiplex I/O from different thread-groups. ;;;; Additional areas for research. What forms of inter-thread communication? thread-signal(thread datum &rest arguments) Converts datum & arguments into a condition, and signals it in thread. ;;;; Special binding details. Need thread specific cache. ;;;; Spin locks. Spin locks are a primitive built-in type that are directly implemented by the compiler. make-spin-lock spin-lock-grap spin-lock spin-lock-release spin-lock spin-lock-locked? ;;;; Random code. (defstruct mutex ;; ;; The name for this mutex. Only used for printing. (name nil :type t) ;; (spin-lock (make-spin-lock) :type spin-lock :read-only t) ;; ;; Whether or not this mutex is currently locked. (locked nil :type (member t nil)) ;; ;; The threads that have blocked trying to lock this mutex. (blocked nil :type list)) (defun mutex-lock (mutex) (let ((sl (mutex-spin-lock mutex))) (loop (spin-lock-grab sl) (unless (mutex-locked mutex) (return)) (push (thread-self) (mutex-blocked mutex)) (spin-lock-release sl) (thread-sleep :blocked)) (setf (mutex-locked mutex) t) (spin-lock-release sl)) t) (defun mutex-try-lock (mutex) (let ((sl (mutex-spin-lock mutex))) (spin-lock-grab sl) (let ((result (if (mutex-locked mutex) nil (setf (mutex-locked mutex) t)))) (spin-lock-release sl) result))) (defun mutex-unlock (mutex) (let ((sl (mutex-spin-lock mutex))) (spin-lock-grab sl) (unless (mutex-locked mutex) (spin-lock-release sl) (error "Attempt to unlock a mutex that isn't locked: ~S" mutex)) (dolist (thread (mutex-blocked mutex)) (thread-wakeup thread)) (setf (mutex-blocked mutex) nil) (setf (mutex-locked mutex) nil))) (defstruct event ;; ;; The name of this event. Only used for printing. (name nil :type t) ;; ;; Spin-lock used to protect the blocked list. (spin-lock (make-spin-lock) :type spin-lock :read-only t) ;; ;; List of all the threads that are waiting on this event. (blocked nil :type list)) (defun event-wait (event mutex) (let ((sl (event-spin-lock event))) (spin-lock-grab sl) (push (thead-self (event-blocked event))) (mutex-unlock mutex) (spin-lock-releaes sl) (thread-sleep :waiting))) (defun event-signal (event) (let ((sl (event-spin-lock event))) (spin-lock-grab sl) (let ((waiters (event-blocked event))) (when waiters (setf (event-blocked event) (cdr waiters)) (thread-wakeup (car waiters))) (spin-lock-release sl)))) (defun event-broadcast (event) (let ((sl (event-spin-lock event))) (spin-lock-grab sl) (let ((waiters (event-blocked event))) (setf (event-blocked event) nil)) (spin-lock-release sl) (dolist (waiter waiters) (thread-wakeup waiter)))) ;;;; Implementation notes. Information about threads is grouped into three different categories. There is the stuff only used by the lisp-level support code. This is all encapsulated in a defstruct somewhere. These structures are what gets manipulated at the lisp level. Information needed by the running thread itself is kept in a mutator structure located on the stack of the thread itself. While running in lisp, a pointer to this structure is kept in a register. While running in C, this pointer can be computed from the stack pointer. The ``thread'' field holds the Lisp structure pointer. Information needed by the OS support is stored after the mutator structure. As only the OS support needs this information, only the OS support stuff needs to know it is there. Lisp level info: name, plist, status, results mutator pointer Mutator info: foreign-fn-call-active control-stack start and pointer. current-unwind-protect current-catch-frame binding-stack start and pointer. number-stack start and pointer. interpreter stack, etc. ephemeral region start, fill-pointer, size. write-list start, fill-pointer, size. interrupts-enabled, interrupt-pending, pending-signal, etc. words-consed ;;;; Low-level iterface. The C-level interface looks something like: void mutator_spawn(void func(void *arg), void *arg) Spawn a new thread, and have it call func(arg). void mutator_exit(void) Kill and deallocate the current thread. void mutator_sleep(void) Put the current thread to sleep. void mutator_wakeup(struct mutator *thread) Wake the supplied thread up. struct mutator *mutator_self(void) Return the thread structure for the current thread. struct mutator *mutator_from_nsp(void *nsp) Return the thread structure for the thread with the stack that nsp points into. void mutator_suspend(struct mutator *thread) Suspends thread. If thread is currently in a pseudo-atomic, let it finish the pseudo-atomic first. void mutator_resume(struct mutator *thread) Undoes mutator_suspend. ;;;; MACH implementation ideas. MACH supports threads directly, so scheduling is taken care of for us. The only thing we have to do is figure out how to put threads to sleep and wake them up. Two ideas: Idea one: to put a thread to sleep call msg_rcv. To wake it up, send it a message. This is rather simple, but it complicates save and restore. It also requires one mach thread per lisp thread, even for lisp threads that are asleep. Idea two: maintain a queue of stopped lisp threads and a pool of mach threads. When we want to wake up a lisp thread, grab a mach thread from the pool (or create a new one if the pool is empty) and use thread_set_state to set it up. And then let it rip. When we want to go to sleep, use save_state to record our state, and then add us to the pool of available threads. Idea two details: mutator_create: spin_lock_grab(thread creation lock) allocate stacks spin_lock_grab(thread_pool_lock) extract thread from pool spin_lock_release(thread_pool_lock) use thread_set_state to setup the new thread link the new thread in spin_lock_release(thread creation lock) mutator_sleep: spin_lock_grab(mutator lock) if (--awake == 0) save_state(go_to_sleep) else spin_lock_release(mutator lock) go_to_sleep(nsp) saved-nsp = nsp state = asleep spin_lock_release(mutator lock) spin_lock_grab(thread_pool_lock) add thread to thread pool spin_lock_release(thread_pool_lock) thread_suspend(thread_self) mutator_wakeup(mutator) spin_lock_grab(mutator lock) if (awake++ == 0) spin_lock_grab(thread_pool_lock) grab thread from pool spin_lock_release(thread_pool_lock) use thread_set_state to setup this threads state. spin_lock_release(mutator_lock) thread_resume(thread) else spin_lock_release(mutator_lock) mutator_suspend(mutator) spin_lock_grab(mutator lock) if (--awake == 0) thread_suspend(thread) thread_get_state(thread) if (pseudo_atomic-atomic) set-pseudo-atomic-interrupted thread_resume(thread) msg_recv status = suspended spin_lock_release(mutator lock) spin_lock_grab(thread_pool_lock) add thread to pool spin_lock_release(thread_pool_lock) else spin_lock_release(mutator) mutator_resume(mutator) same as wakeup. ;;;; SunOS support ideas. SunOS doens't support threads in the kernel. They have a C library that builds thread support on top of interval timer signals, but we can't use it because it doesn't allow us to GC the registers. So we have to build our own. Threads have an ``awake'' count. If this is zero of less, then the thread is asleep and should not be run. If this is greater then zero, then let the thread run. Threads also have a status: fetal, running, sleeping, signaled. When fetal, the thread has been allocated, but hasn't been started (i.e. born) yet. When running, it is running. When sleeping, it was suspended via a call to mutator_sleep. When signaled, it was suspended via a signal (probably the timer interrupt). mutator_spawn: turn all signals off. allocate everything, marking status as fetal. add new thread to runnable queue. turn signals back on. mutator_sleep: turn signals off if (--awake == 0) save_state(go_to_sleep) turn signals back on. go_to_sleep: remove from runnable queue store nsp in current thread. mark thread as asleep. schedule_next_thread mutator_wakeup: turn signals off if (awake++ == 0) add thread to runnable queue turn signals on mutator_suspend: turn signals off if (--awake == 0) remove thread from runnable queue turn signals on mutator_resume: same as mutator_wakeup schedule_next_thread: pick next thread to schedule switch (status) case fetal: call_on_stack(init_func) case asleep: resore_state(saved nsp) case signaled: sigreturn(saved sigcontext *) ;;;; Signal/Exception handling. Signals and Exceptions come in two flavors: ones that are specific to a particular thread, and ones that are not. When specific to a particular thread, that thread is stopped by the OS, and we better not restart it until we have cleared up whatever the problem was. When not specific to a particular thread, the OS may have stopped some random thread. In which case, we should restart that thread immediately after spawning the handler thread. Under MACH, all exceptions are thread-specific. And if we pick off all exceptions before they get forwareded to the UNIX server, then all signals will not be thread specific. Because of the interrupting nature of signals, we can't do anything in the real sigvec signal handler that needs a lock. In other words, we can't spawn the handler thread directly. Instead, we send a message to the exception handler thread that tells it to spawn the handler thread. Under SunOS, there are no exceptions, just signals. So we have to decide on a signal by signal basis if it is thread specific or not. Well, we would like to be able to tell kernel generated signals from kill(2) generated signals, but that might not be possible. Anyway, as there is really only one thread, we can just use sigblock/sigsetmask to keep from context switching at bad times. ;;;; Garbage collection support. The garbage collector needs to suspend all other threads. The best way to do that depends on the threads package being used. So let the thread support also include the code to invoke the garbage collector. Under MACH: Have a GC thread that: while (1) { msg_rcv /* Wait for someone to send us a message */ sigblock mutator_suspend all real threads. do the GC mutator_resume all real threads. msg_send /* send a reply message */ sigsetmask } and when someone wants to invoke a GC they just msg_rpc to the GC thread. Under UNIX: Just wrap a call to the garbage collector with calls to sigblock/sigsetmask. That way the schedular is blocked out. ;;;; Save/Restore. Under MACH: Save/Restore represents a significant problem for multi-threaded applications. If we use blocking on a msg_rcv to put threads to sleep, then it will be very difficult to restore these threads, as the port will have changed out from underneath us. If we use the thread pool idea above, then all we have to do is mutator_suspend all the threads, write the info to disk, and then mutator_resume all the threads in the restored core. Under UNIX: Easy: there is only one thread. So just save_state(do_save) and have do_save record the saved nsp along with the state for all the other threads. ;;;; Debugger. When a thread hits an internal error, that thread itself needs to call ERROR. So first of all, make sure the internal error signal or exception marks the thread as suspended. Then have a facility for faking the setup of a function call on a suspended thread. This facility could even be used to start the thread originally.