Reading
Read Chapter 4
Polling vs. Interrupts
Generally speaking, devices are unable to respond to instructions instantaneously. Instead, devices are given instructions, act on them over time, and then respond when they are done. Modern operating systems often try to perform other jobs while one job is blocked waiting for I/O. In order for this scheme to work, the devices need some way of letting the system know that they need attention.
One way of handling this situation is for the software to periodically poll the device and ask "are you done yet?" This approach does work, but unfortunately many cycles are wasted as the software nags the hardware.
Another approach, which is commonplace in modern systems, requires hardware support: hardware interrupts. Imagine a system in which each device has a wire, called an Interrupt Request (IRQ) line, attached directly to the CPU. When a device wants attention, it applies a voltage to this line. This voltage is then sensed by the CPU, allowing it to take appropriate action.
But wait! The CPU just follows instructions. The Program Counter (PC) just moves from one instruction to the next instruction, unless directed otherwise via some type of branch, right? Well, generally speaking, yes. But in this case, the CPU and the devices are doing some under-the-table dealing via something called the Interrupt Vector Table. The interrupt vector table is an array of function pointers located at a predetermined address in memory. (This address may be hard wired or stored in a register initialized at boot, or perhaps the entire table is stored in registers within the CPU itself.)
At boot time, the addresses for special functions known as Interrupt Service Routines (ISRs) or handlers are stored into this array. Each function is the piece of the operating system responsible for taking the right action in response to a device's request for attention.
Each of the wires that connects a device to the CPU has a number. When an interrupt occurs, the CPU knows which IRQ line is high, and it uses the number of that line as an index into the interrupt vector table. The CPU then jumps to the address stored in that element, executing the appropriate service routine. When the routine is done, it restores the CPU to its previous state and the CPU picks up where it left off.
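To make the indexing concrete, here is a minimal user-space sketch of a vector table as an array of function pointers. It only simulates the idea: real hardware performs the lookup and jump itself, and the names used here (`NUM_IRQS`, `install_handler`, `dispatch`, the two sample handlers, and the `last_serviced` variable) are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_IRQS 8                     /* assumed number of interrupt lines */

typedef void (*isr_t)(void);           /* an ISR takes no arguments in this sketch */

static isr_t vector_table[NUM_IRQS];   /* the simulated interrupt vector table */
static int last_serviced = -1;         /* records which handler ran (demo only) */

/* Two hypothetical service routines. */
static void timer_isr(void)    { last_serviced = 0; }
static void keyboard_isr(void) { last_serviced = 1; }

/* Done at "boot": store a handler's address in the table. */
static void install_handler(int irq, isr_t handler) {
    assert(irq >= 0 && irq < NUM_IRQS);
    vector_table[irq] = handler;
}

/* What happens on an interrupt: use the IRQ number as an index,
   then call through the pointer stored there.
   Returns 0 if a handler ran, -1 if none is installed. */
static int dispatch(int irq) {
    if (irq < 0 || irq >= NUM_IRQS || vector_table[irq] == NULL)
        return -1;
    vector_table[irq]();
    return 0;
}
```

After `install_handler(0, timer_isr)` and `install_handler(1, keyboard_isr)`, a call to `dispatch(1)` runs `keyboard_isr`. No searching is involved; the interrupt number selects the routine directly.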
Well, I have to confess that the world is actually a little bit more complicated than I've alleged so far. There's actually another piece of logic involved called the interrupt arbitrator or interrupt controller. What does it do? Well, it is basically the broker that comes between the individual devices and the CPU. The CPU is (arguably) the most valuable, most contended resource in the system -- and like most valuable resources, it has an administrative assistant to control interruptions. The interrupt arbitrator is exactly that administrative assistant. Much like the boss's secretary comes between the underlings and the boss, the arbitrator comes between the devices' interrupt requests and the CPU.
Each device's IRQ line is connected to the arbitrator, which is connected via one interrupt line to the CPU. When a device requests attention, the interrupt is received by the interrupt arbitrator. If it is okay, the arbitrator interrupts the CPU, which in turn executes the ISR as described above.
Well, how does the arbitrator know whether or not it is okay to disturb the boss? Like a good administrative assistant, it carefully follows the boss's instructions. In order to decide whether or not a particular device should get the CPU's attention, it is helpful to know the relative importance of the devices. For this reason the number associated with each device's IRQ line is often used as a priority -- the lower the number, the more important the device. The arbitrator has a register, set by the CPU, that stores the interrupt level. The arbitrator will only interrupt the CPU if the device's priority is at least as high as this level, that is, if its IRQ number is less than or equal to the value stored in this register. In other words, if this register holds the value 5, we say that the interrupt level is 5, and the CPU will only be interrupted by interrupts 0-5.
Well, what about the other devices? What if they need attention, but the CPU is doing something more important? Well, there is another register that contains one bit for each interrupt. Each time an interrupt occurs, the bit for this interrupt is set. This bit is cleared by the CPU when it services the interrupt. This register allows the CPU to discover what interrupts it may have missed. It is important to note that there is only one bit of information available about each interrupt -- not a queue. If the same interrupt occurs more than once, this is not known to the CPU. The only thing that it can discover is that the interrupt occurred at least once.
There is one more register typically found in an interrupt arbitrator, the interrupt mask. This register holds one bit for each interrupt. The interrupt mask is ANDed with the interrupt register discussed above. If the bit is not set in the interrupt mask, the CPU will not see that interrupt. The interrupt mask allows the CPU to temporarily (or permanently) ignore certain devices, independent of their priority.
In this way, the CPU can interact with the devices in an orderly way.
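The three registers can be sketched in a few lines of C. This is a simulation under stated assumptions, not a model of any particular controller: it assumes 8 interrupt lines, and the names `struct arbiter`, `raise_irq`, `next_irq`, and `acknowledge` are invented for illustration.

```c
#include <stdint.h>

/* A simulated arbitrator with 8 lines (an assumption of this sketch). */
struct arbiter {
    uint8_t pending;  /* one bit per IRQ: set when the interrupt occurs */
    uint8_t mask;     /* one bit per IRQ: 1 = the CPU may see this interrupt */
    int     level;    /* interrupt level: only IRQs 0..level get through */
};

/* A device raises its line. Raising twice still sets just one bit,
   so the count of occurrences is lost. */
static void raise_irq(struct arbiter *a, int irq) {
    a->pending |= (uint8_t)(1u << irq);
}

/* Return the most important IRQ that may interrupt the CPU right now,
   or -1 if there is none. Lower numbers are more important. */
static int next_irq(const struct arbiter *a) {
    uint8_t visible = a->pending & a->mask;   /* pending ANDed with the mask */
    for (int irq = 0; irq <= a->level && irq < 8; irq++)
        if (visible & (1u << irq))
            return irq;
    return -1;
}

/* The CPU clears the pending bit when it services the interrupt. */
static void acknowledge(struct arbiter *a, int irq) {
    a->pending &= (uint8_t)~(1u << irq);
}
```

With the level set to 5 and every mask bit set, raising IRQ 6 leaves `next_irq` at -1, but the pending bit stays set. Once the CPU raises the level to 7, the same call returns 6, which is how a "missed" interrupt is discovered later.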
The Timer
Consider this: When the CPU is executing a particular program, the PC is moving through that program from one instruction to the next, unless the program, through a branch, instructs it to do something else. In a time sharing environment, for example, the OS might want to regularly switch among tasks in order to appear responsive to all users. But if one program is running, how is the OS to get control?
One approach to this is for all programs to regularly invoke the operating system by making a system call. In this way, programs could periodically yield to the OS, allowing the OS to dispatch another program. Windows 3.x used this approach, called non-preemptive multitasking. Programmers would periodically include the yield() call in their code. This would invoke the OS's scheduler, which might schedule another task to run.
But this approach was far from perfect. One problem involves a program that "runs away" executing some loop endlessly -- the OS can never get control to let anything else run. The other problem is that the more frequently a program yielded to the OS, the fewer cycles it might get. In the competitive environment in which software is written, some programmers might be tempted to yield less frequently (of course, this could be punished by the scheduler when they finally did yield).
Do you remember when I mentioned the timer device earlier? I told you it was special and that we'd talk about it later. Modern computers include hardware timers, because they give the OS another way of getting control of the CPU. The timer device is basically a countdown timer, much like one you might use to measure a cooking time (like an hour-glass egg timer), or to enforce a time limit for a sporting event. The timer has a register that is initialized, usually at boot time, to a particular value. It counts down from this value to 0, at which time it interrupts the CPU. The interval between timer interrupts is known as a time quantum. A typical quantum might be about 10 ms.
Using this approach, a particular job is left to run for up to one quantum, at which time the timer interrupt goes off and the OS's scheduler is invoked via the timer's ISR. At this time the same job, or another job, might be dispatched to the CPU.
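The countdown behavior is easy to simulate. The sketch below is only a software model of the register described above; the names `struct timer`, `timer_init`, and `timer_tick` are assumptions, and a return value of 1 stands in for the hardware raising the timer interrupt.

```c
#include <assert.h>

struct timer {
    unsigned reload;  /* value loaded at boot: clock ticks per quantum */
    unsigned count;   /* current countdown value */
};

static void timer_init(struct timer *t, unsigned ticks_per_quantum) {
    assert(ticks_per_quantum > 0);   /* a zero-length quantum makes no sense */
    t->reload = ticks_per_quantum;
    t->count  = ticks_per_quantum;
}

/* Advance the timer by one clock tick. Returns 1 when the count reaches 0
   (the moment real hardware would interrupt the CPU), otherwise 0.
   The counter is reloaded so that the next quantum starts immediately. */
static int timer_tick(struct timer *t) {
    if (--t->count == 0) {
        t->count = t->reload;
        return 1;
    }
    return 0;
}
```

With `timer_init(&t, 3)`, every third call to `timer_tick(&t)` returns 1. In a real system that is the point at which the timer's ISR would invoke the scheduler.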
It is important to note that jobs don't always execute for an entire quantum without interruption. Sometimes they perform I/O or do something else that temporarily prevents them from making use of the CPU. Oftentimes libraries are written so that the call yields the CPU immediately after dispatching such a request; this is often called a blocking operation. We'll talk about this more soon.
Context Switching
In most cases, processes aren't aware that they are sharing the CPU and other system resources with other processes. Behind the scenes, much like in the theater, the set has to be torn down and reassembled between each scene. This takes much work and wastes many cycles. Consider the fact that each process has its own understanding of memory and has different values stored in registers. Consider the effect that switching tasks has on caches. We'll talk more about all of the accounting later, but for now, trust me -- it is expensive.
So why would the OS alternate among processes instead of letting one run until completion? One answer might be time sharing, to allow several processes to interact with users without forcing the users to conform to the computer's schedule (the alternative was getting in line to use the computer -- a common scene in yesteryear). Another reason might be to make use of CPU cycles that would be wasted by a process waiting for a device to complete a request.
Protection
At this point let me point out that with several different users sharing the same resources, the operating system needs to act a bit like a police officer and keep the order. It would be an interesting world if students could read my gradebook on the Andrew system and see each other's grades -- or if faculty and staff could see each other's salaries, &c. But the OS is a piece of software, like any other, so how can it do this?
The answer, as was the case for preemptive scheduling, is that the OS needs some level of support from the hardware. Hardware can enforce limits on what memory addresses a particular program can access, &c. The OS can change these values by using privileged instructions. These instructions may be incorporated into ISRs, or they may be invoked via special instructions called traps. In either case, the hardware verifies that the user is in fact the OS (or other privileged process) before allowing the instruction. If this isn't the case, an exception occurs. Exceptions are often handled in a way similar to interrupts, via a service routine.
Memory Hierarchy
The last thing I'd like you to remember from a prior class is the storage hierarchy. Computers contain several different types of memory. Faster memory is often more expensive, so less of it is available. Slower memory is often more plentifully available. It is the goal of the system to use faster memory as often as possible. To achieve this goal, the system tries to keep the items that will be used sooner in faster memory. Oftentimes this involves using some policy to estimate what is likely to be used next -- one common policy is to keep the most recently used items in the fastest memory, assuming that they are most likely to be used again soon.
There are several different types of memory in a system. As we move down in this list, we are moving to slower memories that are typically more plentifully available because of a lower cost per unit:
- registers -- very small units of memory built into the CPU itself that operate at the same speed as the CPU
- L1 cache -- memory that is slightly slower than the CPU that is typically separate from the CPU but part of the same package.
- L2 cache -- memory that is slower than the L1 cache, but faster than main memory. It is usually not part of the CPU's package, but it can be.
- Main memory -- The slowest RAM in the system -- but much faster than the next level.
- disk -- once used only for "external" or "offline" storage, the existence of demand paging and demand segmentation has turned disk into a massive, but slow secondary RAM. It is also often used for non-volatile storage in portable devices.
Typically, the program manages the registers through instructions that direct the CPU when to load and store values into/out of RAM. The caches are usually managed by hardware that uses a policy such as LRU with write-back or write-through to decide when to read or write values into main memory. Caches are invisible from the software side of things, except for the impact of misses on performance -- and the possible need to flush them upon context-switch. The operating system is generally responsible for the movement of information between main memory and disk -- consequently, it is this part of the hierarchy that we will study in the greatest depth this semester.
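To get a feel for what an LRU policy does, here is a toy sketch of a two-slot "fast memory". It is a software illustration only: real caches implement this in hardware and usually only approximate true LRU. The names `SLOTS`, `struct lru`, and `lru_access` are assumptions made up for this example.

```c
#define SLOTS 2   /* assumed capacity of the fast memory */

struct lru {
    int  key[SLOTS];        /* which item each slot currently holds */
    long last_used[SLOTS];  /* logical time of last use; 0 means the slot is empty */
    long clock;             /* logical time, advanced on every access */
};

/* Access the item `key`. Returns 1 on a hit and 0 on a miss.
   On a miss the item replaces an empty slot if there is one,
   otherwise the least recently used slot. */
static int lru_access(struct lru *c, int key) {
    int victim = 0;
    c->clock++;
    for (int i = 0; i < SLOTS; i++) {
        if (c->last_used[i] != 0 && c->key[i] == key) {
            c->last_used[i] = c->clock;   /* hit: now the most recently used */
            return 1;
        }
        if (c->last_used[i] < c->last_used[victim])
            victim = i;                   /* remember the oldest slot seen so far */
    }
    c->key[victim] = key;                 /* miss: evict the victim and load the item */
    c->last_used[victim] = c->clock;
    return 0;
}
```

Starting from a zeroed structure, the access sequence 1, 2, 1, 3 produces miss, miss, hit, miss, and the final miss evicts item 2 rather than item 1, because item 1 was used more recently.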
Abstraction
We'll hear about many abstractions this semester -- we'll spend a great deal of time discussing various abstractions and how to model them in software. So what is an abstraction?
Again, a very good definition was provided by a student very early in the discussion:
An abstraction is a representation of something that incorporates the essential or relevant properties, while neglecting the irrelevant details.
I think this is a very good definition. Throughout this semester, we'll often consider something that exists in the real world and then distill it to those properties that are of concern to us. We'll often then take those properties and represent them as data structures and algorithms that represent the "real world" items within our software systems.
The Task
The first abstraction that we'll consider is arguably the most important -- a representation of the work that the system will do on behalf of a user (or, perhaps, itself). I've used a lot of different words to describe this so far: task, job, process, &c. But I've never been very specific about what I've meant -- to be honest, I've been a bit sloppy.
This abstraction is typically called a task. In a slightly different form, it is known as a process. We'll discuss the difference when we discuss threads. The short version of the difference is that a task is an abstraction that represents the instance of a program in execution, whereas a process is a particular type of task with only one thread of control. But, for now, let's not worry about the difference.
If we say that a task is an instance of a program in execution, what do we mean? What is an instance? What is a program? What do we mean by execution?
A program is a specification. It contains definitions of what type of data is stored, how it can be accessed, and a set of instructions that tells the computer how to accomplish something useful. If we think of the program as a specification, much like a C++ class, we can think of the task as an instance of that class -- much like an object built from the specification provided by the program.
So, what do we mean by "in execution"? We mean that the task is a real "object," not a "class." Most importantly, the task has state associated with it -- it is in the process of doing something or changing somehow. Hundreds of tasks may be instances of the same program, yet they might behave very differently. This happens because the tasks were exposed to different stimuli and their state changed accordingly.
Representing a Task in Software
How do we represent a task within the context of an operating system? We build a data structure, sometimes known as a task_struct or (for processes) a Process Control Block (PCB) that contains all of the information our OS needs about the state of the task. This includes, among many other things:
- content of registers (like the PC)
- content of the stack
- memory pages/segments
- open files
When a context switch occurs, it is this information that needs to be saved and restored to change the executing process.
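As a rough sketch, such a structure and the save/restore step might look like the following in C. This is not the layout used by any real kernel: the field names, the sizes `NUM_REGS` and `MAX_OPEN_FILES`, and the `context_switch` helper are all assumptions, and a real context switch is done in assembly on the actual CPU registers rather than on a struct.

```c
#include <stdint.h>

#define NUM_REGS       8    /* assumed number of general-purpose registers */
#define MAX_OPEN_FILES 16   /* assumed per-task limit */

/* The register contents that must survive while a task is off the CPU. */
struct cpu_context {
    uintptr_t pc;               /* program counter */
    uintptr_t sp;               /* stack pointer: locates the task's stack */
    uintptr_t regs[NUM_REGS];   /* general-purpose registers */
};

/* A toy process control block holding the items listed above. */
struct pcb {
    int pid;                          /* task ID */
    struct cpu_context ctx;           /* saved register contents */
    void *page_table;                 /* memory pages/segments */
    int open_files[MAX_OPEN_FILES];   /* open files */
};

/* Simulated context switch: save the CPU state into the outgoing task's
   PCB, then load the incoming task's saved state onto the CPU. */
static void context_switch(struct cpu_context *cpu,
                           struct pcb *from, struct pcb *to) {
    from->ctx = *cpu;
    *cpu = to->ctx;
}
```

If the simulated CPU is at PC 100 while task A runs and task B was saved at PC 200, then `context_switch(&cpu, &a, &b)` leaves 100 stored in A's PCB and puts 200 on the CPU. Switching back restores A exactly where it left off.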
Task State
Just like people, tasks have lifestyles. They aren't always running and they don't live forever. Typical UNIX systems view tasks as existing in one of several states:
- New: Recently created
- Ready: Ready to run, but not yet assigned to a processor
- Running: Actually executing on a processor
- Waiting: Not currently executing. Not currently able to execute. Won't be runnable until some specific external event occurs, such as an I/O operation.
- Terminated: Done executing, won't accomplish anything else useful or again be placed on a CPU.
A task moves from the new state to ready state after it is created. Once this happens, we say that the task is "admitted."
After the scheduler selects a task and assigns it to a processor, we say that the task has been "dispatched."
When a task is done, it "exits." It is then in the terminated state.
If a task is waiting for an event, such as a disk read to complete, it can "block" itself, yielding the CPU. It is then in the "wait" state. The system has many different wait queues -- not one universal wait queue -- in fact, there is one wait queue for each possible reason to wait. This is because it would be very expensive to sift through a long list each time a resource became available or other event occurred. It is not a case of needing a list of lists, either -- since each list is associated with the event, it doesn't require any searching -- if we take care of the queue when we handle the event, we're already in the right place.
After the event occurs, the operating system can move the task to the "ready" state.
After a task has exhausted its time slice, it can be moved into the ready state to allow another task access to the processor.
Please pay careful attention. The operating system is responsible for creating tasks, dispatching them, readying them after an event, and interrupting them after their time expires. Tasks must exit and block voluntarily.
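The states and transitions described above form a small state machine, which can be written down directly. The sketch below is only an illustration; the enumerator names and the `next_state` function are assumptions, with the event names taken from the quoted terms in the text ("admitted", "dispatched", and so on).

```c
enum task_state { NEW, READY, RUNNING, WAITING, TERMINATED };
enum task_event { ADMIT, DISPATCH, TIMEOUT, BLOCK, WAKEUP, EXIT };

/* Return the state that follows `s` when event `e` happens,
   or -1 if that transition is not allowed. */
static int next_state(enum task_state s, enum task_event e) {
    switch (e) {
    case ADMIT:    return s == NEW     ? READY      : -1;  /* OS admits a new task */
    case DISPATCH: return s == READY   ? RUNNING    : -1;  /* scheduler picks it */
    case TIMEOUT:  return s == RUNNING ? READY      : -1;  /* time slice exhausted */
    case BLOCK:    return s == RUNNING ? WAITING    : -1;  /* task waits for an event */
    case WAKEUP:   return s == WAITING ? READY      : -1;  /* the event occurred */
    case EXIT:     return s == RUNNING ? TERMINATED : -1;  /* task is done */
    }
    return -1;
}
```

Notice that `next_state(WAITING, DISPATCH)` is -1: a waiting task has to pass through the ready state before it can run again.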
Creating New Tasks
One of the functions of the operating system is to provide a mechanism for existing tasks to create new tasks. When this happens, we call the original task the parent. The new task is called the child. It is possible for one task to have many children. In fact, even the children can have children.
In UNIX, child tasks can either share resources with the parent or obtain new resources. But existing resources are not partitioned.
In UNIX, when a new task is created, the child is a clone of the parent. The new task can either continue to execute with a copy of the parent image, or load another image. We'll talk more about this soon, when we talk about fork() and the exec() family of calls.
After a new task is created, the parent may either wait for the child to finish or continue and execute concurrently (real or imaginary) with the child.
Task Termination
A child may end as the result of normal completion, it may be terminated by the operating system for "breaking the rules", or it might be killed by the parent. Oftentimes parents will kill their children before they themselves exit, or when their function is no longer required.
In UNIX, when a task terminates, it enters the defunct state. It remains in this state until the parent recognizes the fact that it has ended via the wait() family of calls. Although a defunct task has given up most of its resources, much of the state information is preserved so that the parent can find out the circumstances of the child's death.
In UNIX, children can outlive their parents. When this happens, there is a small complication: the parent is not around to acknowledge the child's death. A child whose parent has already exited is known as an orphan. Orphans are adopted by the init process, which waits for them when they terminate, allowing them to have a proper burial. A terminated task that has not yet been waited for is commonly called a zombie; this is another name for the defunct state.
Fork -- A traditional implementation
fork() is the system call that is used to create a new task on UNIX systems. In a traditional implementation, it creates a new task by making a nearly exact copy of the parent. Why nearly exact? Some things, such as the ID number, don't make sense to duplicate exactly.
The fork() call returns the ID of the child process in the parent and 0 in the child. Other than these subtle differences, the two tasks are very much alike. Execution picks up at the same point in both.
If execution picks up at the same point in both, how can fork() return something different in each? The answer is very straightforward. The stack is duplicated and a different value is placed on top of each. (If you don't remember what the stack is, don't worry, we'll talk about it soon -- just realize that the return value is different.)
The difference in the return value of fork() is very significant. Most programmers check the result of the fork in order to determine whether they are currently the child or the parent. Very often the child and parent do very different things.
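A minimal sketch of this check, using the POSIX calls fork(), _exit(), and waitpid(), is shown here. The helper name `fork_and_report` is made up for illustration, and having the child do nothing but exit with a chosen code is just a simple way to make the two branches visibly different.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork once. The child exits immediately with `code`; the parent waits
   for it and returns the child's exit code, or -1 on any error. */
static int fork_and_report(int code) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;        /* fork failed: no child was created */
    if (pid == 0)
        _exit(code);      /* child: fork() returned 0 */

    /* parent: fork() returned the child's ID */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both tasks return from the same fork() call, but only the one that sees 0 takes the `_exit(code)` branch. The parent is the only one that reaches waitpid().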
The exec() Family of Calls
Since the child will often serve a very different purpose than its parent, it is often useful to replace the child's memory space, which was cloned from the parent, with that of another program. By replace, I am referring to the following process:
- Deallocate the process's memory space (memory pages, stack, etc.).
- Allocate new resources.
- Fill these resources with the state of a new process.
- (Some of the parent's state is preserved: the group ID, interrupt mask, and a few other items.)
Fork w/copy-on-write
Copying all of the pages of memory associated with a process is a very expensive thing to do. It is even more expensive considering that very often the first act of the child is to deallocate this recently created space.
One alternative to a traditional fork implementation is called copy-on-write. The details of this mechanism won't be completely clear until we study memory management, but we can get the flavor now.
The basic idea is that we mark all of the parent's memory pages as read-only, instead of duplicating them. If either the parent or any child tries to write to one of these read-only pages, a page fault occurs. At this point, a new copy of the page is created for the writing process. This adds some overhead to page accesses, but saves us the cost of unnecessarily copying pages.
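Copy-on-write is invisible to programs, so it can't be observed directly from user code. What can be shown is the behavior it must preserve: after a fork, a write by the child never changes what the parent sees. The sketch below demonstrates only that guarantee, and it holds whether the kernel copies pages eagerly or on write. The helper name `parent_value_after_child_write` is an assumption for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's view of `value` after the child has modified
   its own copy, or -1 on any error. */
static int parent_value_after_child_write(void) {
    volatile int value = 1;   /* volatile keeps the variable in memory */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 2;            /* under copy-on-write, a first write to a shared
                                 page is what gives the writer a private copy */
        _exit(value);         /* report the child's view through its exit code */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 2)
        return -1;            /* the child should have seen its own write */
    return value;             /* the parent's copy is untouched */
}
```

The child sees 2 and the parent still sees 1, even though no page needed to be copied until the child's write.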
vfork()
Another alternative is also available -- vfork(). vfork() is even faster, but can also be dangerous in the wrong hands. With vfork(), we do not duplicate or mark the parent's pages; we simply loan them, and the stack frame, to the child process. During this time, the parent remains blocked (it can't use the pages). The dangerous part is this: any changes the child makes will be seen by the parent process.
vfork() is most useful when it is immediately followed by an exec(). This is because an exec() will create a completely new process space, anyway. There is no reason to create a new task space for the child, just to have it throw it away as part of an exec(). Instead, we can loan it the parent's space long enough for it to get started (exec'd).
Although there are several different functions in the exec family, the only difference is the way they are parameterized; under the hood, they all work identically (and are often one).
After a new task is created, the parent will often want to wait for it (and any siblings) to finish. We discussed the defunct and zombie states last class. The wait-family of calls is used for this purpose.
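Putting the pieces together, here is a minimal sketch of the usual fork/exec/wait pattern using the POSIX calls fork(), execl(), and waitpid(). The helper name `run_shell` is made up, and the sketch assumes a shell exists at /bin/sh; it is one way to arrange these calls, not the only one.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `sh -c command` in a child task and return its exit status,
   or -1 if the child could not be created or did not exit normally. */
static int run_shell(const char *command) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child: replace the cloned image with the shell */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);           /* reached only if execl() failed */
    }

    /* parent: wait for the child so that it does not stay defunct */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_shell("exit 3")` returns 3. Notice that nothing after execl() runs in the child unless the call fails, because a successful exec replaces the program that made the call.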
Fork-Exec Example
Click here for a simple example of a fork-exec program.