What is an operating system?
An operating system is a system that separates the idiosyncratic details of the hardware from the user or users. It attempts to provide an environment in which the user can easily execute programs in a convenient and efficient manner by relieving them of the trouble of managing processes and resources.
What services does an operating system provide?
Operating systems run programs, coordinate I/O activity, manage the file system, provide communications between processes, detect errors, allocate resources, provide protection of information resources and processes, and keep accounting information.
What is the difference between buffering and spooling?
Buffering is a method of overlapping the computation of a job with its execution. It temporarily stores input or output data in an attempt to better match the speeds of two devices such as a fast CPU and a slow disk drive. If, for example, the CPU writes information to the buffer, it can continue in its computation while the disk drive stores the information.
With spooling, the disk is used as a very large buffer. Usually complete jobs are queued on disk to be completed later. A typical example is the spooler for a printer. When a print job is issued, the spooler takes care of it, sending it to the printer if it is not busy, or storing it on disk otherwise.
The main difference between buffering and spooling is that the latter allows the I/O of one job to overlap the computation of another. Buffering only allows the I/O of a job to overlap with its own computation.
How is buffering different from caching?
With buffering, the data is accessed in a serial fashion in one direction (read or write), and each piece of the data is accessed only once. In contrast, caching makes a copy of the data and allows multiple reads and writes to it in a non-sequential order.
(Winter 1999) What is double buffering?
Double buffering is a hardware method typically used in video displays. Double buffering means having two sets of image buffers, one front visible buffer and another back non-visible buffer. It allows one image to be drawn while the other is displayed, thereby increasing animation quality.
(Winter 1999) Describe everything that happens on a getchar() from a disk file. Where there is buffering/caching, specify the possibilities. Also indicate where things are happening.
In the C runtime library:
In the kernel:
What is the degree of multiprogramming of a system?
The degree of multiprogramming is the number of processes in memory and competing for the CPU. This is controlled by the medium and long term schedulers.
The concept of multiprogramming did not become useful (in the sense of performance) until the DMA channel was developed. Why is this true?
DMA is used to relieve the CPU of the trouble of transferring data from I/O devices. Buffers, pointers and counters are set for the device, and then the device controller transfers an entire block of data to or from its buffer to main memory. After the transfer is complete, and interrupt is sent to the CPU.
Before DMA, the CPU was responsible for I/O transfer of data. With the advent of the DMA channel, the CPU could spend more time executing other processes while the DMA controller coordinated the transfer. The end result would be one interrupt per block, instead of an interrupt after every word or byte.
Explain the difference between multiprogramming and time sharing.
Multiprogramming allocates the CPU based on the expected CPU burst time for the processes that are ready to run. Time sharing is used on interactive systems to provide the processes the illusion of a slower computer. Basically, time sharing uses CPU scheduling and multiprogramming to give each interactive process a portion of the CPU time.
What is the difference between a program and a process?
A program is a static representation of a computation to be performed. In contrast, a process is an active entity that contains state in the form of the program counter, registers, open files, memory, etc. -- a program in execution.
What about a thread and a task in relation to a process?
A thread is a basic unit of CPU utilization that has very little data that is not shared. (Only a program counter, a register set, and stack space.) Usually several threads share code, address space and operating system resources. The benefits of threads are that there is little setup time, and threads may be blocked without halting the overall execution of the main process. On the other hand, there is no protection between threads, and some coordination between them may be necessary. The environment in which these threads execute is called a task. A process is equivalent to a task with one thread.
What data do threads not share?
Each thread has its own register state (including the program counter), as well as its own stack (neither of which is protected within the process).
Why would one want to use threads?
Threads provide an inexpensive means of sharing the CPU and creating multiple computational entities. Since they do not contain very much of their own data, switching between them does not involve the same cost as does a context switch with traditional processes. A problem with this approach is that threads would require some sort of synchronization because they share an address space. Another benefit is better utilization on multiprocessor architectures.
(Winter 2000) With regard to an I/O intensive multithreaded program, which approach would you recommend? Kernel-level threads or user-level threads? Assume I/O is blocking.
Definitely kernel-level threads, since in a user-level threads model, the whole process would be stalled for each I/O call, defeating concurrency. This comes from the fact that the base unit of scheduling with respect to the kernel would be the whole process, not the thread.
What is a PCB?
A PCB is a Process Control Block. It contains the information associated with
a process that differentiates it from the others, such as:
What is is it used for?
Along with the data in memory and on disk, the PCB holds the state of a running program. This allows the operating system to save the information and halt the process during interrupts and context switches.
Define the following: utilization, throughput, turnaround time, waiting time, response time.
What is the convoy effect?
The convoy effect occurs in a FCFS scheduling system when a large CPU-bound process is using the CPU and many smaller I/O-bound processes are waiting. When the CPU-bound process relinquishes control and performs I/O, the other processes quickly finish their computations and enter the I/O queue. Then the CPU-bound process completes its I/O and regains control of the CPU, after which the other processes complete their I/O and reenter the CPU queue. Better resource utilization would occur if the I/O-bound processes were allowed to use the CPU first.
What does it mean to be preemptive?
One usually hears the term preemptive in the context of scheduling algorithms. It means that a job that is currently being processed can be interrupted by a higher priority task or an interrupt and either aborted or resumed later.
What is a problem with priority scheduling?
With priority scheduling, it is possible for a process with a low priority to never be allocated time for the CPU. This is called starvation.
How could one solve this problem?
If one increased the priority of a process based on the amount of time that it had been waiting (called aging), it would eventually reach a high enough priority that it would be able to execute.
(Winter 1996) What is Shortest Job First?
It is a scheduling algorithm in which the next job to be executed is selected from the available pool based on its predicted CPU burst time. (Shortest first.) It is a special case of the general priority scheduling method, with the priority being based on expected CPU time.
Round Robin?
With round robin, each process in turn is given a predefined set of time to execute. If the process completes its computation in less than the time allocated, it relinquishes the CPU. Otherwise, the CPU is taken from it when the time expires. The problem with this method is that the average waiting time can be quite long, and that one also pays the overhead cost of context switches.
Multilevel Queue?
With multilevel queue scheduling, processes are assigned to one of several queues, each of which has its own scheduling algorithm. (An example might be foreground and background queues.) Between the queues there is also a scheduling method, possibly round robin or priority.
Multilevel Feedback Queue?
A multilevel feedback queue is a multilevel queue in which processes can be migrated from one queue to another based on their CPU burst characteristics.
(Winter 1996) In what way is shortest job first optimal?
It minimizes the average waiting time.
Sketch a proof that shortest job first is optimal.
Shortest job first is an optimal non-preemptive scheduling algorithm. To prove that it is optimal, assume that it is not and try to reach a contradiction. (Optimal means lowest average waiting time.)
Let the function RT(P) determine the running time of process P. Assume that we have a schedule for n processes that is optimal, and for which a process Pj is scheduled before Pk, and RT(Pj) > RT(Pk):

The average waiting time for this schedule will be the sum of the waiting times of all the processes divided by the number of processes n. If we swap the two processes Pj and Pk, then the waiting times for the processes before Pk and after Pj will remain the same. Pk's waiting time will decrease more than Pj's will increase, since RT(Pj) > RT(Pk). Furthermore, the waiting time of the processes scheduled between Pk and Pj will decrease.

The result will be that the total waiting time will decrease, and thus the average waiting time will be less. This is a contradiction of our assumption that there is an optimal schedule that is not shortest job first. Therefore, the shortest job first algorithm provides the optimal non-preemptive schedule.
(Winter 1996) Name a problem with shortest job first.
Starvation. There is the chance that a process that has a large CPU burst will not have an opportunity to execute.
(Spring 1994) What is a critical section?
A critical section is a region of code in which a process needs exclusive access to a shared object such as a variable or file.
What properties characterize a solution to the critical section problem?
What is the producer-consumer problem?
A producer produces data that is stored in a (possibly unbounded) set of buffers. The consumer consumes the data in a buffer and marks the buffer as empty. In the bounded buffer situation, the producer can only put data into empty buffers, and may have to wait if none are available. Likewise, the consumer must wait if no data are in the buffers.
Outline a solution to the problem.
The processes share an array of n buffers, as well as a pair of integers in and out. The producer can fill a buffer only if it is not the buffer before the in buffer (modulo n). The consumer can not empty a buffer if in equals out.
What is the Bakery Algorithm, and how does it work?
It is an algorithm or mutual exclusion in which a process wanting a resource is assigned a number higher than than the maximum of the other waiting processes. When a process wishes to use the resource, it cycles through the other processes looking for a processes with lower numbers. If any are found, the process waits until the other ones complete their work, then proceeds.
There is also the detail that during the time when a process is choosing a number, other processes must wait until it is assigned one before comparing numbers. One can implement this using an array with a boolean value choosing for each contending process.
(Spring 1995) How do you implement mutual exclusion primitives such as locks and semaphores?
When a user executes a primitive operation, a software exception is raised (a trap), and the OS takes control by changing the operating mode from user to monitor. The OS then runs without interrupts, executing the necessary code to service the request, such as managing semaphore queues. When the operation is completed, the mode is switched back to user and control is returned to the original process.
What is dual-mode operation?
Dual-mode operation is a means of providing security for an operating system from users. A bit is used to select either user mode or monitor (also called supervisor, system, or privileged) mode. If a user tries to execute a privileged instruction, it is trapped into the operating system as an error.
(Spring 1995, Spring 1996) What is a semaphore?
It is an abstract data type used for synchronization, consisting of a shared integer variable as well as two atomic operations P and V, which respectively cause a process to wait if the resource is being used, and to signal waiting processes when the resource is released.
(Spring 1995, Spring 1996, Winter 1999) Give an example of a counting semaphore that busy waits. What is a problem with this implementation?
Here is a counting semaphore, in which we use a shared variable initialized to the number of allowed concurrent processes. (The ++ and -- operators are assumed to be atomic.)
integer count;
P() {
boolean done;
done = FALSE;
while (!done) do {
<Disable Interrupts>
if (count > 0) {
count--;
done = TRUE;
}
<Enable Interrupts>
}
}
V() {
count++;
}
The problem with this implementation is that busy waiting is very inefficient in terms of CPU allocation. We would be better off using those cycles on another process that is doing useful work.
Now give an implementation of P() and V() that does not busy wait.
For this implementation, we will another shared variable, a first-in-first-out queue of waiting processes.
integer count;
FIFO_list queue;
P() {
count--;
<Disable Interrupts>
if (count <= 0) {
AddToQueue(pid);
Sleep();
}
<Enable Interrupts>
}
V() {
PID ProcessToWakeup;
count++;
<Disable Interrupts>
if (count <= 0) {
ProcessToWakeup = RemoveFromQueue();
}
<Enable Interrupts>
Wakeup (ProcessToWakeup);
}
Wait, if you disable interrupts, then put yourself to sleep, how will you be wakened?
Well, it turns out that the interrupt flag is part of the program status word, which gets swapped out when sleep is called. The OS then modifies the interrupt flag when it swaps another process in.
(Spring 1995, Spring 1996, Winter 1997) Why would one not typically use such semaphores on a multiprocessor system, and how would you fix your semaphore implementation for such an application?
On a multiprocessor system it is not possible to insure mutual exclusion by simply disabling interrupts. One would need to use one of the critical section solutions such as spin locks, a test-and-set on shared memory, or the Bakery Algorithm.
(Winter 1997) Is there a method that we can use that does not busy wait?
If there is shared memory, then you can store a queue of waiting processes, as well as the semaphore variable. Instead of busy waiting, a process can enqueue itself and then put itself to sleep. On a V() operation, a process is removed from the queue and awoken.
What is a test-and-set?
A test and set is an atomic operation that assigns the value "true" to a boolean variable and returns the variable's previous value. This is typically supported by the instruction set architecture.
Why is it useful?
It allows one to easily implement a lock for a critical section like so:bool lock;
while !Test-and-Set(lock) {}
<Critical Section>
lock = FALSE;
Likewise, one can implement binary semaphores without disabling interrupts.
Show how a test and set can be simulated given an atomic swap operation.
bool new,old; new = TRUE; repeat Atomic-Swap(new,old); until (new == FALSE); <Critical Section> old = FALSE;
What is a main problem users face when using semaphores?
Since there are no controls over when the semaphore P and V operations are called, they must be tightly controlled by the processes using the semaphore to make sure that, for example, nobody calls a P without the corresponding V.
How do monitors address this problem?
Monitors encapsulate the resource, thus shielding the user from explicitly having to call synchronization operations.
What exactly is a monitor?
A monitor is an abstract data type that encapsulates a shared resource. The user defines operations that are valid, and only one operation may be executing in the monitor at a time. Within the monitor a special kind of blocking semaphore called a condition variable can be used to synchronize processes with its wait() and signal() operations. Monitors make the writing of concurrent programs easier since they remove some of the danger of using semaphores directly.
(Winter 1999) What is the difference between a semaphore and a monitor?
Although semaphores are powerful, they are rather "low level" and error-prone. A slight error in placement of semaphores can lead to big problems. It is also easy to forget to protect shared variables with a mutex semaphore. A better (higher-level) solution is provided by the monitor.
The monitor is a synchronization mechanism based on data abstraction. The shared data is encapsulated in routines, and automatic mutual exclusion is provided on these routines; only one process may execute a procedure of a monitor at any time. Monitors also have a queuing mechanism for waiting on conditions.
(Winter 1998) How can you implement a counting semaphore using only binary semaphores?
First we need a counting variable, and a semaphore to protect it from being modified in an uncontrolled manner. Likewise we need a semaphore to perform the wait on when the count is less than or equal to 0. Lastly, we need another semaphore to insure sequential execution of our wait() code.
BinarySemaphore S1; // Insures only one process executes wait() at once
BinarySemaphore S2; // Protects C, the counter
BinarySemaphore S3; // The "real" semaphore that processes will wait on.
integer C; // The counter, initialized to some value > 0;
wait() {
wait(S1);
wait(S2);
C--;
if (C < 0) {
signal(S2);
wait(S3);
} else
signal(S2);
signal(S1);
}
signal() {
wait(S2);
C++;
if (C <= 0)
signal(S3);
signal(S2);
}
(Winter 1999) What is a deadlock, and what are the conditions for deadlock?
A deadlock is a situation in which two or more processes are all waiting for an event that only the other processes can cause. A classic example is resource deadlock for two processes, a printer, and the disk. One process holds the printer, and the other holds the disk, and they both wait for the other to release their resource.
The four conditions for deadlock are:
What is starvation, and how is it different from deadlock?
Starvation is a situation in which a process may be indefinitely blocked. In such a situation, it is possible that the process could be executed, whereas it is impossible in deadlock for any process to be executed.
Can a system detect if a process is starving?
Yes. The system could store the time when the process begins waiting for one or more resources, and then periodically check the times of all waiting processes. If the wait time exceeds some threshold, the system can assume that the process is starving, and then give it a higher priority, allow it to preempt other processes, etc.
What is the Dining Philosophers problem?
It is a canonical process synchronization problem postulated by Dijkstra.
There are five chairs at a table and one fork between each setting on the table. Philosopher, who alternately think and eat, are sitting at the table. When a philosopher wishes to eat, he picks up the forks to his left and right, if possible, and eats uninterrupted. Otherwise he picks up any forks that he can and waits for the forks or other fork to become available.
The philosophers can become deadlocked if all of them pick up the fork from the same side at the same time.
What are some methods for preventing deadlock for this problem?
What is the Readers-Writers problem?
Several readers and writers are accessing a set of shared data. Multiple readers can read the data at the same time, but writers must have exclusive access. One variant to this problem is "Readers Priority", where the readers get access to the data before writers. Similarly, "Writers Priority" forces readers to wait until all writers have completed their work.
What is the standard protocol for determining if a message has been lost?
A timeout. If a process sends a message and does not receive a reply within the timeout interval, it assumes that the message was lost.
What is the benefit of the mailbox method of message passing?
It allows processes to refer to each other indirectly instead of naming them explicitly, which allows processes to be renamed without having to recompile every program that refers to them.
What are the downsides to the mailbox method?
One possible downside is that the operating system must provide mechanisms to create, destroy, and use mailboxes. In addition, there may be a performance loss due to the extra level of indirection. Lastly, if the message passing scheme copies the data, then multiple copies of large amounts of data to many mailboxes can be time and space consuming. (Mach circumvents this by providing a reference to the data in the callers address space.)
What is reentrant code?
Reentrant code, also called "pure code" was often mistakenly thought to simply be code that does not modify itself. In reality, we must also be sure that any variables changed by the routine are allocated in separate locations in memory for each invocation. In general, A function is reentrant if, while it is being executed, it can be re-invoked by itself, or by any other routine, by interrupting the present execution for a while.
Originally reentrant code was used in mainframes where the amount of memory was limited, and system designers wanted to be able to share the same code among many different users instead of allocating space for each execution.
What is the difference between short, medium, and long-term scheduling?
Sometimes there are more jobs that need to be executed than can be executed at once, and these jobs are usually spooled to disk. In such cases, the long term scheduler selects a job to run from the spooled jobs. The short term scheduler decides which of the currently running processes gets control of the CPU. The medium term scheduler on some systems removes jobs from memory and places them on disk, to be continued at a later time. (This is called swapping.)
(Winter 1997) What is Little's formula?
Little's result is used in queuing theory to determine the average length of the queue. if n is the average length of the queue, W is the average waiting time in the queue, and lambda is the average arrival rate, then n is given by
What is the difference between deadlock prevention and deadlock avoidance?
In prevention, the operating system prevents one or more of the conditions for deadlock from occurring. With avoidance, processes tell the operating system their maximum resource requirements, and an avoidance algorithm grants the requested resources if there exists some sequence of process execution that will allow all processes to complete. A state is safe if such a sequence exists.
(Winter 1999) How can you prevent the conditions for deadlock?
Mutual exclusion can not be compromised. We can break hold and wait by either requiring that all processes acquire resources before executing, or by only allowing resources to be requested when a process has none. (Unfortunately, both of these can lead to low resource utilization and starvation.)
No preemption can be circumvented by forcing processes who can not allocate all of their resources to implicitly release the ones they've acquired. Likewise, we can allow processes to preempt other processes who have resources that are needed and are also waiting for more resources. (These policies work best for situations in which the state of the shared resource can be easily saved and restored.)
Circular wait can be broken by creating a total ordering for the resources. We can then either require that processes request resources in increasing order, or that they release resources ordered higher than the one they are currently requesting.
(Winter 1999) If a system is in deadlock, what can it do?
One thing the system can do is kill one of the processes in the circular wait, repeating until the deadlock is resolved. Another is force a process to give up the resource(s) it has allocated.
(Winter 1999) What is partial and total deadlock?
Partial deadlock is when a group of processes on a system are deadlocked, but some part of the system is still able to make progress.
Total deadlock is when the system is completely unable to make progress.
Does being in an unsafe state imply deadlock?
No. Safe implies lack of deadlock, but it is possible to be in an unsafe state that does not lead to deadlock.

How can a resource allocation graph help determine deadlock?
A resource allocation graph is a graph consisting of nodes (either resources or processes) and edges (either requests or uses). A cycle in the graph implies that there may be a deadlock. For graphs in which each resource type has only one instance, a cycle implies that a deadlock has occurred.
What is a wait-for graph?
It is a specialized instance of a resource allocation graph that limits the number of instances of each resource to one. A cycle in the graph means that there exists a deadlock. In the following graph, for example, there exists a deadlock since P1 is waiting for P2, P2 is waiting on P3, and P3 is waiting on P1.

What is the Banker's algorithm?
The Banker's algorithm is a deadlock avoidance algorithm that keeps track of the available, allocated, maximum, and needed resources in the system. Before a resource request is granted, the algorithm checks to make sure that doing so will not leave the system in an unsafe state.
What problems may arise when a process is rolled back as a result of deadlock?
When you roll a process back, you are attempting to restore its state to that of a previous known "safe" state. The problem is that it is not as simple as restoring the memory, registers and program counter; the process may have written to disk, or sent messages to other processes. In order to do an effective rollback one must restore everything with which the process interacted.
What restriction does assembly time or load time binding of addresses place on the swapping of processes?
If addresses are not bound at run time, then the process must be swapped into the same memory location from which it was swapped out.
Are there any other restrictions that one must worry about when swapping processes?
Pending I/O can be a problem, since we don't want an I/O request to manipulate memory that is currently being used by another process. Similarly, I/O that operates asynchronously to the CPU might manipulate user memory. One solution is to never swap out a process performing I/O or with pending I/O requests. Another is to have I/O write to operating system buffers instead of user memory. In the latter case, the operating system would only transfer the data to the process when it is in memory.
(Spring 1994) Explain the difference between internal fragmentation and external fragmentation.
Internal fragmentation occurs when a system allocates memory in fixed-size chunks (e.g. pages). Since a process is unlikely to require exactly the amount of memory allocated, some of it will be unused. In contrast, external fragmentation is unused memory between variable-sized allocations of memory to processes. Typically this contiguous memory is too small to be useful, and therefore wasted.
Describe three ways one can address the problem of external fragmentation.
One way is to periodically do compaction of the memory space. If addresses can be bound at run time, then we can move the used memory so that the unused memory will be contiguous, and then reset the base register. Of course, one pays a cost of periodically having to shuffle large amounts of memory.
Another way to reduce the average amount of external fragmentation is to break a process into two or more smaller sections, which will increase the likelihood that small memory spaces will be used. This requires more than one base-limit pair of registers, but if we separate code and data, it would have the benefit of being able to share the code part with multiple processes.
A common way of addressing fragmentation is to loosen the requirement that process memory be contiguous. By breaking physical memory into frames and logical memory into pages, one can relocate a process' memory into many different, noncontiguous places in memory. Paging also avoids the problem of trying to fit variable-sized chunks of memory on to the backing store.
How is an address translated using paging? (Feel free to use diagrams)
See the figure below. The CPU generates a logical address, which is then separated into the page number and offset. (Typically, the first bits of the address determine how many pages there are.) The page number is used to index into the page table, which holds the physical locations of the logical pages. If the page is in memory, then the frame is combined with the offset to yield the physical address.
In a virtual memory system, the pages may also be stored on disk, in which case a page fault occurs and the page must be brought in to memory.

What's a TLB?
A Translation Look-aside Buffer is a hardware method of speeding up the translation of logical to physical memory addresses. It acts as a cache for most recent translations. When an address is generated, its page number is compared to all of the page-frame pairs in the TLB in parallel, and if the page is in the TLB, the associated frame is known. Since it is implemented in hardware, the time required to access the memory may only be 10% slower than an unmapped memory reference.
What is involved with a context switch?
A context switch occurs when a process is interrupted and another process gains control of the system. It is important in such situations to save the state of the previously running process, which involves saving the registers, program counter, page table or segment pointers, stack, etc.
How does segmentation compare to paging?
With segmentation, programs are compiled into separate units such as code, data, and stack. The user then specifies a segment number and offset instead of an address. This is somewhat different than paging in that the manipulation of a single address is transparent to the user in paging, but with segmentation the user must provide two quantities for each memory reference.
A segment table is used to translate segment numbers to physical addresses in a manner similar to a page table. Every segment number and offset pair is translated into a base address and offset. One difference between this method and paging is that segments can have variable lengths, which necessitates a field in the segment table containing the limit of the offset value. Trying to offset past the limit of the segment causes an addressing error.
Like paging, a TLB can be used to lessen the cost of the extra memory reference.
What is paged segmentation?
With paged segmentation, the segments each have a page table whose size depends on the size of the segment. With this method, allocating space does not require the large search time of traditional segmentation, but there is the cost of an additional memory reference before the data is found. The reason for this is that there is one access to the segment table in order to get the base of the page table, then another access to get the physical page associated with the segment. (The segment offset is broken down into a page number and page offset.)
What property must code have in able to be shared?
It must be reentrant.
Why is it easier to share a reentrant module using segmentation than pure paging?
With segmentation, sharable code can be divided into segments, and the processes sharing a segment would simply have identical segment entries in their segment tables. As a result, it is relatively easy to protect against memory access violations. With pure paging, however, the shared code and unshared code is not easily divided, which makes it more difficult to protect against unauthorized memory usage. With segmentation protection comes almost for free, but with paging every sharable section of memory would need base and limit values to define the allowable memory access range, or would need a separate page allocated.
(Spring 1994) What is virtual memory?
Virtual memory is a technique that allows the execution of processes that may not be completely in memory. It allows programmers to think of memory as a large uniform space, and even to use more memory than is physically available. Thus, there is a separation and mapping between logical memory and physical memory, which has an added benefit of providing some protection from the user's actions.
Why is virtual memory almost never used in a real-time system?
In a real-time system one is very interested in the dependability of the system. In particular, one wants the system to be predictable in the time required for it to complete its operations, as well as fast in operation. Virtual memory, while useful in terms of interfacing to the memory hardware, separates the user from the hardware and creates less predictable delays in memory-related operations.
(Spring 1994, Winter 1997) What is demand paging?
Demand paging is a method of implementing virtual memory by only bringing a page into memory when it is demanded. An extra bit in the page table marks the page as valid (in memory) or invalid (on disk). If the page is already in memory, execution continues as before. But if it is on disk, then a page fault occurs and it must be loaded into memory. Processes start without any pages allocated, so the first instruction fetch causes a page fault.
(Spring 1994) What happens on a page fault?
Refer to the illustration below. First, the page table for the process is used to determine if the page corresponding to the logical address is invalid. If it is, then the OS chooses a free frame or selects a page to be paged out, and gets the needed page from the backing store (typically on disk). The page table is then updated to reflect that the page is now in memory and the instruction is restarted.

Briefly outline how this procedure gets complicated by a TLB and an inverted page table.
The CPU generates an address. If the address translation is in the TLB, then the physical address it acquired and computation is resumed. Otherwise, check to see if it is in the inverted page table. If it is, then get it and update the TLB. If it isn't, then see if the proper page of the main page table is in memory. If it isn't then another page fault occurs, but if it is, then we can get the physical address.
The next step is to select a victim page to remove from memory. After this victim is selected, we check to see if the page is dirty, and if it is we write it to the backing store. Then we read the new page into physical memory, update the page table and TLB, and continue our previous computation.

When would you want to lock in a page?
During asynchronous I/O, a separate I/O processor may be storing information in a page, and removing that page from memory would cause problems for the I/O system.
(Spring 1994, Winter 1997) What is memory mapped I/O?
See answer in the architecture section
What are memory mapped files?
In memory mapped files, processes read and write from sequential memory locations, which is semantically simpler than the typical read(), write(), and lseek() commands. With virtual memory, it also makes accessing large files possible without having to copy them into memory, and is slightly more efficient because you don't have to go through read and write buffers. Another use is for shared memory. This is typically a service provided by the OS.
(Spring 1995) What are page replacement algorithms?
For most operating systems, there is a higher demand for memory than can be provided. One method of dealing with such a situation is to use a virtual memory system, in which process memory is broken into pages, which allows the process to be executed with only a fraction of its pages in memory. The problem, of course, it that occasionally a memory reference is made to a page that is not in memory, and the OS must have some policy for choosing which page will be removed from memory to make room for the referenced one. Such algorithms are page replacement algorithms.
(Winter 1997) What is Belady's Anomaly?
For some algorithms, increasing the number of frames available can actually cause an increase in the page fault rate.
What is a stack algorithm?
A stack algorithm is a page replacement algorithm for which the pages in memory for n frames is always a subset of the set of pages for n+1 frames. Such algorithms do not exhibit Belady's Anomaly, and the optimal and LRU algorithms are examples.
(Spring 1994) Describe two page replacement algorithms.
LRU, clock, or additional-reference-bits algorithms. See below.
(Spring 1994, Spring 1995) Why is it that no page replacement algorithm can be optimal?
The optimal algorithm is one in which the page that will not be used for the longest time is swapped out. No algorithm can be optimal because it is impossible to foresee the future page usage.
(Winter 1997) How do you implement LRU page replacement?
One method is to save with each page use a clock of some sort (whether logical or actual). When a least recently used page is needed, a comparison of all the "timestamps" on the pages can be made.
Another way to implement LRU is to have a stack containing references to all the pages. When a page is accessed, it is moved to the top of the stack. The page at the bottom of the stack is the least recently used.
(Winter 1999) Given the following accesses and three cache frames, how many misses would you have using LRU? ABCDE AEDC ABCDE
The frames after each access would look like:
A x x (miss)
A B x
(miss)
A B C (miss)
D B C (miss)
D E C (miss)
D E A (miss)
D E A
(hit)
D E A (hit)
D E C (miss)
D A C (miss)
B A C (miss)
B A C
(hit)
B D C (miss)
E D C (miss)
For a total of eleven misses.
What is the Additional-Reference-Bits algorithm.
Associate with each page a byte or word of memory, depending on the amount of past information one wishes to store. At regular intervals, shift the bits for every page toward the high order bits by one, and store in the low order bit a 1 if the page has been used, and a 0 if it hasn't. By treating the byte or word as an unsigned integer, the lowest number will be the least recently used. This is an approximation to least recently used.
(Spring 1996, Winter 1997) What is the clock algorithm?
The clock algorithm is a second chance algorithm for page replacement that approximates LRU. The available pages are viewed as a circular queue, with a pointer indicating which page is to be replaced next. When a victim page is to be chosen, the pointer advances until it finds a 0 reference bit. As it advances it unsets any set reference bits. The clock algorithm was first used in the Multics operating system. This is an even further approximation to least recently used.
(Winter 1997) How does the clock algorithm compare to the second-chance algorithm?
The second chance algorithm is a variation on FIFO, in which the oldest pages are checked first. Each page has a bit that indicates whether it has been looked at by the algorithm. Every page gets 1 "freebie" before being paged. The clock algorithm is different in that a page has its bit reset every time it is accessed, so that it may have more than two chances to stay in memory.
(Winter 1997) What happens if I increase the number of chances?
Increasing the number of chances approximates LRU more, but it takes longer to find a victim.
(Winter 1997) What is a working set?
A working set is the set of pages used by a process within a certain past interval of time. The general assumption is that pages used recently will be likely used in the near future.
(Winter 1997) What happens when you modify the parameter
? What happens in the limit?
If you increase
, more of the task's pages
will be in memory, and in the limit, all of the pages will be in memory.
Shortening
lowers the number of pages, possibly causing the system to thrash.
Describe the difference between global and local frame allocation.
With global allocation, frames are selected from the whole set of available frames, but with local frame allocation a process can only select from its own set of frames. Local allocation allows processes to control their own page fault rate, but does not allow the allocated number of frames to change. Global allocation allows greater system throughput because it can better distribute frames where needed, but can result in widely varying performance for each process.
(Winter 1997) What is thrashing?
Thrashing occurs when a process is allocated less pages than it is presently accessing. When a page fault occurs, a page is selected and removed from memory to make room for the new page. But since that page will be needed in the near future, another page fault occurs. The correspondingly high amount of I/O and resulting low CPU utilization is called thrashing. (This can be defined as occurring when more time is spent with I/O than computation.)
Thrashing can be caused when the long term scheduler increases the number of active processes in order to increase the degree of multiprogramming and CPU utilization. Unfortunately, when page faults begin to occur, CPU usage goes down, and the scheduler tries to solve the problem by adding more processes. The resulting positive feedback brings the system to a crawl.
What kind of algorithm limits thrashing?
A local replacement algorithm does not allow a thrashing process to steal frames from another process, thereby limiting the effects. However, a thrashing process will still use the paging device, causing an overall (smaller) loss in performance. This is a somewhat draconian approach, since the thrashing process will probably stop thrashing, where it would if it were allowed to steal frames.
(Spring 1994) Assume that you have the following statistics for computer:
| Disk Drive | Memory | ||
|---|---|---|---|
| Heads: | 6 | Virtual memory: | 100 MB |
| Average seek time: | 8.0 msec | Physical memory: | 20 MB |
| Average latency: | 4.2 msec | Page size: | 1024 bytes |
| Transfer time: | 5 MB/sec | ||
| Tracks: | 80 | ||
| Sectors/track: | 128 | ||
| Bytes/sector: | 1024 | ||
When does this computer start thrashing?
A computer is thrashing when it spends a lot of its time accessing the disk, and little of it doing CPU work. Let's call the maximum ratio of time spent accessing the disk r, and any higher ratio is thrashing. Assume that any time the disk isn't being accessed, the CPU is being used. We can say the computer is thrashing if
Further, page faults occur at some page fault rate pfr. For each page fault, we must transfer 1024 bytes from the disk, which takes an amount of time given by
Now any time that there is between the completion of the service for the current page fault and the next page fault is time that the CPU can execute.

This is given by
since the period of time between page faults is the inverse of the page fault rate, which equals the sum of the CPU time and the disk time. Substituting this into our original thrashing formula yields
If we say that the system is thrashing if it spends more than half of its time servicing page faults, then
What is prepaging?
Prepaging attempts to lessen the cost of page faults when a process is first started by trying to load its initial working set into memory. Likewise, a process' working set can be saved when it is swapped out and loaded back when it is swapped in.
(Winter 1997) What is an inverted page table?
Traditional virtual memory tables map each virtual page to a physical frame, which makes lookup very fast, but makes the table consist of millions of entries. An alternative is to have a table mapping each physical frame to the virtual page(s). This method results in a table whose number of entries corresponds to the number of frames.
One drawback to this approach is that we now have to use something like a hash table to do the lookup instead of doing it directly. Another problem is that we still need information on the pages that are not in the inverted page table (i.e. in physical memory), which necessitates external tables to record where on the backing store the page is. Typically these tables are themselves paged, which isn't too costly considering that they will be accessed when a page fault occurs anyway. Lastly, it is more difficult to implement shared memory with this scheme since we can not simply have two entries in the virtual page table with the same physical frame.
What is a FAT?
A FAT is a file allocation table, a simple and efficient manner of recording the used blocks on a disk. The table of disk blocks contains entries to the next disk block in the sequence for the file, and a special end-of-file value indicates the last block. Finding a starting location for a new file consists of finding the first unallocated block, indicated by a value of zero.
Describe a UNIX inode.
An inode contains the following:
(Winter 2000) What is a BSD socket?
A BSD socket is an abstraction of a network port. BSD sockets are primarily used in the UNIX world for IPC (e.g., UNIX pipes) and network connections.
(Winter 1996) How does a UNIX pipe work?
UNIX pipes were developed as an alternative to using files as a communication mechanism between programs. The first process generates output to the standard output, which is redirected to the standard input of the second program. For efficiency pipes are usually implemented as a circular buffer in which producer/consumer synchronization is used. The second program blocks if no output is available, and the first program blocks if the buffer is full.
What are safety and liveness properties?
Safety deals with getting the "right" answer from a computation, and liveness deals with the rate of progress of the computation.
What does it mean to be concurrent?
Two operations are concurrent if they can be, but need not be, executed in parallel.
What is RAID?
RAID stands for Redundant Array of Inexpensive Disks. This method is a cost-effective solution to the problem of making disk systems more fault-tolerant. There are different means of organizing the redundancy, ranging from shadowing of disk drives to bit interleaved parity, where bits are distributed across disk drives, with an extra parity bit.
What determines the total response time of a read from a disk?
Total service time = seek time + latency time + transfer time
In a naming system such as a file system, what is the difference between location transparency and location independence?
Location transparency means that the name is independent of the object's physical location. Location independence means that the location can be changed without affecting the name. (i.e. It is a dynamic mapping that is stronger in the sense that it can map the same name to two different locations at different times.)
What three characteristics are inherent with distributed systems?
Distributed systems do not have a global clock or shared memory, and are subject to unpredictable communication delays, which makes common OS tasks such as information sharing and synchronization difficult.
What is the difference between a hard real-time system and a soft real-time system?
A hard real-time system guarantees that critical tasks are completed on time, and not being able to do so is considered a system failure. The definition of a soft real-time system is somewhat undefined, except to say that it is "not hard real-time". In general, priority is given to critical tasks, but some deadlines may not be met.
What is Two-Phase Locking?
Two-Phase Locking is a concurrency control algorithm for database transactions in which locking occurs in two phases: in the growing phase a transaction requests locks without releasing any locks, and in the shrinking phase the transaction releases locks without requesting any more. Using this method decreases the average time that locks are held, thereby increasing concurrency.
Like any other dynamic locking policy, 2PL suffers from the possibility of deadlock and cascading roll-backs. A cascading roll-back occurs when a transaction failure leads to a series of transaction roll-backs.
What is an election algorithm?
Election algorithms are used by distributed systems to create a centralized coordinator. In cases where one site is used for special purposes (such as coordinating access to an input/output device), the failure of the central site requires that another site must take over its functions, and election algorithms determine who should do this.
(Winter 2000) What is an IP address?
An IP address is a unique identifier defining a machine (host or router) on the network. IP addresses are 32-bit long (IPv4) or 128-bit long (IPv6). IPv4 is currently the protocol of choice for the network layer on the Internet. Instead of using a 32-bit integer, people use the dotted decimal notation, that is, four integers ranging between 0 and 255 separated by dots.
(Winter 2000) So a router that has ten interfaces can only have one IP address, right?
No. Uniqueness means that there shall not be any IP address conflicts on the network, but a machine can have more than one IP address. Typically a router will have one IP address per interface card. In your example, the machine probably has ten different IP addresses, and possibly more (aliases, for example).
(Winter 2000) What's TCP?
TCP is a transport protocol (TCP = transport control protocol). It is one of the transport protocols in use on the Internet (along with UDP). TCP is a reliable protocol, in the sense that it deals with end-to-end connections, providing error recovery, retransmission mechanisms, and congestion control among other features.
(Winter 2000) With respect to byte ordering, what is the convention in use on the network?
The network is traditionally big-endian.
What are the layers of the OSI model?
(Winter 1999) What happens on a fork()?
A new process is created with a fork(2) system call. This call creates a child process, which is effectively a duplicate of the parent process that created it. This is implemented using a copy-on-write scheme, where data pages are copied only when a write operation is requested, thus avoiding unnecessary copying. fork copies the parent's data and stack segments (or regions) plus its environment to the child. The child shares its parent's text segment, which conserves memory space (by default, all REAL/IX Operating System programs are reentrant). Child processes initialize more quickly than the original process, since they only have to modify parts of the inherited environment rather than recreate the entire environment.
fork returns values to both the parent and child, but returns a different value to each. The parent receives the process id (PID) of the child, which can never be 0, and the child receives 0. The program uses these return values to determine which is the child process and which the parent process (since they are both executing the same program) and takes different branches in the code for each.
After checking the value returned by fork, the parent and child execute different branches of the same program in parallel. If the child needs to execute a different program, it issues an exec(2) call. exec replaces the text and data segments of its caller with those of a new program read from a file specified in the call. exec does not alter its caller's environment; a child process may execute a different program, but still have access to its parent's files, although they may have been modified by the child between the fork and the exec.
A child issues an exit(2) system call to terminate normally; this call takes a parameter whose value is returned to the child's parent. A child may also terminate abnormally by a signal issued by the kernel, a user, or another process. When a child terminates, either normally or abnormally, the operating system sends a SIGCLD signal to the parent process. Signals are discussed in more detail in Chapter 4.
(Winter 2000) What's an M/M/1 queue?
Firstly, a queue is defined by two entities: requests and servers. When we talk about an M/M/1 queue, the first M describe the model used for the requests, the second describe the servers, and the number defines the number of servers available.
M stands for Markovian, that is memoryless (any future event is independent from what happened in the past). An example of a Markovian process is the Poisson process.
Hence, an M/M/1 queue describes the following: arrivals described by a memoryless process, service described by a memoryless process as well, and only one server.
(Winter 2000) Are Poisson and Markowian the same?
No, Markovian defines a class of stochastic processes (memoryless), whereas Poisson is just a particular instance of that class. In short, Poisson processes are Markovian, but not all Markovian processes are Poisson.
(Winter 2000) Poisson processes are counting processes. How would you implement that on a real computer system?
I would use the alternate definition of a Poisson process, that stands that the interarrival times are exponentially distributed.(Winter 2000) Are the two definitions (counting and interarrival times) strictly equivalent?
Yes.What's an RPC?
A Remote Procedure Call is a way of abstracting the procedure call of programs across process boundaries using network protocols. With this method, arguments are marshaled and sent to the remote system, which unbundles the incoming data and executes the requested procedure on behalf of the client.
What is marshaling?
Turning data structures in memory into a sequential data structure that can be sent over a network as parameters or return value of an RPC.
Last changed January 30, 1997. David Coppit, david@coppit.org