All quizzes (key)

Suppose a single-core system is running three processes, A and B and C. Then the following sequence of events occur in the following order:

Initially, process A is running
Process A starts waiting for input from the keyboard
Process B computes an image and displays it on the screen
Process B computes a second image and displays it on the screen
Process C runs part of its long computation
Process A retrieves its input from the keyboard
Process A causes Process B to be terminated (killed)
Process A starts waiting for keyboard input again
Process C performs the second part of its computation

Question 1 (1 points): (see above) During this sequence, what is the minimum number of process context switches that must occur? (Write your answer a a base-10 integer.)

Answer: (accepted answers: /4|6/)

A -> B (while A is waiting for input) -> C (computation) -> A (finish waiting, system call to kill B) -> C (more computation); note that B does not in gneneral need to run to be terminated, but accepted 6 in case you thought (as happens in xv6 and some other OSes) that B would be run in kernel mode briefly to handle running cleanup code; note that system calls need not always result in a context switch

Question 2 (1 points): (see above) In a Unix-like operating system (with a monolithic kernel design), which of the following operations (from the sequence above) most likely would require starting a system call?
Select all that apply

(correct answer)
Process B computes an image and displays it on the screen
Process C runs part of its long computation
the OS detects input for process A and runs process A
(correct answer)
Process A causes process B to be terminated (killed)

displaying something on the screen requires I/O; the input for process A likely finishes a system call, but process A likely requested the input with an system call started earlier (when it started waiting for input); killing another process requires a system call because it must manipulate data structures that only the OS is allowed to manipulate. I noticed after releasing this quiz that I had edited "the OS detects ..." to not match the phrasing above; this was a mistake, but I don't think changes which answers are correct.

Question 3 (1 points): Suppose an xv6 process has an invalid value for its stack pointer register while it is running in user mode. What are expected consequence of this?
Select all that apply

when an exception (such as a timer interrupt) next occurs, the xv6 kernel will crash while trying to save the user registers to the stack pointed to by that invalid pointer
(correct answer)
if the process is running in user mode and tries to access memory around that where the stack pointer points, the xv6 kernel will be run and kill the application
if the xv6 kernel context switches away from that process, it will not be able to context switch back to it because of the invalid stack pointer

the invalid stack pointer will only trigger an exception if something tries to use it; the xv6 kernel won't try to use it itself (except as part of something like retreiving system call arguments); xv6 won't try to use it to save user registers because it uses the kernel stack pointer for that, which is not the one that is active in user mode

Question 4 (1 points): When an exception occurs, processors will usually _____. (Include all types of exceptions, including what some sources call interrupts, traps, faults, etc.)
Select all that apply

wait for the currently executing function to finish executing before running the exception handler
(correct answer)
use a lookup table to determine the location of the exception handler based on the kind of exception
(correct answer)
switch from user to kernel mode (if the exception occured while the processor was in user mode)

the exception can't wait for the current function to finish because that would prevent things like timer interrupts from reliably running the OS. The lookup table is what's called the interrupt vector table on x86, which we saw being set in xv6 code snippets.

Question 1 (1 points): Consider the following C code that uses the POSIX API:

write(STDOUT_FILENO, "A", 1);
pid_t pid = fork();
write(STDOUT_FILENO, "B", 1);
if (pid == 0) {
    write(STDOUT_FILENO, "C", 1);
    exit(0);
} else if (pid > 0) {
    write(STDOUT_FILENO, "D", 1);
    waitpid(pid, NULL, 0);
    write(STDOUT_FILENO, "E", 1);
}

Ignore any minor syntax errors above, assume all appropriate headers are included and that none of fork(), write(), and waitpid() fail.

Two possible outputs of the above code are "ABDBCE" and "ABCBDE". Write another.

Answer: (accepted answers: /ABBCDE|ABBDCE/)

Comments:

Grader reply: your comment is not actually true (no change in grade)

Question 2 (1 points): Immediately after a call to fork(), what is true?
Select all that apply

(correct answer)
the values of local variables (except perhaps the return value of fork()) in the parent and child process are the same
if the child process modifies a value stored on its heap, the parent can read the modified value
(correct answer)
the parent and child process have the same files open
the child process is most likely running code from a different executable than the parent process

Comments:

Grader reply: your comment does not change which answer is correct (no change in grade)

Question 3 (1 points): Consider the following Unix-style shell command

./A B > C

The following is some partial C code using the POSIX API that would execute that command:

const char *args[10] = {"./A", "B", NULL, NULL};
pid_t pid = fork();
if (pid < 0) {
    handleError();
} else if (pid > 0) {
    waitpid(pid, NULL, 0);
} else {

    ______________________

    ______________________

    execv("./A", args);
}

Ignore any minor syntax errors above, missing error handling, and assume all appropriate headers are included.

Which of the following would be best to place in the blanks above to complete the code? (Ignore minor syntax errors and missing error handling.)

args[2] = "C";
(correct answer)
int fd = open("C", O_WRONLY | O_CREAT | O_TRUNC);
dup2(fd, STDOUT_FILENO);
close(fd);
int fd = open("C", O_WRONLY | O_CREAT | O_TRUNC);
dup2(STDOUT_FILENO, fd);
close(STDOUT_FILENO);
int fd = open("C", O_WRONLY | O_CREAT | O_TRUNC);
dup2(fd, STDIN_FILENO);
close(fd);
int fd = open("C", O_RDONLY);
dup2(fd, STDIN_FILENO);
close(fd);
int fd = open("C", O_RDONLY);
dup2(fd, STDOUT_FILENO);
close(fd);

For the following questions, consider the following Unix-style shell pipeline:

./A | ./B | ./C

When this is run, the commands ./A, ./B, and ./C are run in parallel, the output of ./A becomes the input to ./B, and the output of ./B becomes the input to ./C.

Assume that the shell uses fork() to create new processes and pipe() to allow the new processses to receive other process's output as input.

Question 4 (1 points): (see above) When executing the above shell pipeline, how many times should the shell call fork() or an equivalent function? (Write your answer as a base-10 number like "15".)

Answer: (accepted answers: 3)

Question 5 (1 points): (see above) When executing the above shell pipeline, how many times should the shell call pipe() or an equivalent function? (Write your answer as a base-10 number like "15".)

Answer: (accepted answers: 2)

Question 1 (1 points): In a Unix-like system, which of the following are attributes of a thread as opposed to a process?
Select all that apply

whether file descriptor 1 (which represents standard output) is open
(correct answer)
the value of the register most commonly used to hold function return values
the value of a global variable
the location of the top of the heap

Question 2 (1 points): Which of the following changes to xv6's scheduler would be likely to lower overhead?
Select all that apply

(correct answer)
only calling yield() on every other timer interrupt instead of on every timer interrupt
(correct answer)
running the thread/process-selection code in the thread being switched away from instead of using a dedicated scheduler thread
scanning the process/thread list for the runnable thread that has been run least recently (and adding code to track when threads were last run) instead of selecting the next process/thread on the list

Question 3 (1 points): A first-come first-served scheduler will often be better than a round-robin scheduler at providing which of the following?
Select all that apply

(correct answer)
high portion of compute time used for application computations
low variance in the wait times experienced by each of the threads being scheduled

Consider the following schedule:

initially the three threads A, B, and C are not runnable
at time 0, thread A becomes runnable
thread A is run from time 0 to time 3 (i.e. for 3 units of time)
at time 2, thread B becomes runnable
at time 3, thread C becomes runnable
thread C is run from time 3 to time 5
thread B is run from time 5 to time 8
at time 8, thread B starts waiting for I/O
thread C is run from time 8 to time 10
at time 10, thread C starts waiting for I/O
thread A is run from time 10 to time 11
at time 11, thread A starts waiting for I/O

Question 4 (1 points): (see above) What is the total turnaround time experienced by thread B between time 0 and 11? (Write your answer as a base-10 number like "15".)

One source of confusion on this question was mixing up "waiting for I/O" with "wait time". In the context of CPU scheduling, "wait time" refers to time that a thread could be run on the CPU but is not — so time spent waiting for I/O would not be included.

Answer: (accepted answers: 6)

8 (becomes not runnable) - 2 (becomes runnable); I probably should have been super-explicit that I meant CPU turnaround time for all these schedules, though I hope no one would've guessed otherwise based on how I've talked about scheduling in lecture. (Typically when one talks about turnaround time, one talks about it regarding a particular scheduler. So if we were talking about scheduling when disk I/O happened, there would be a separate notion of turnaround time based on that. But I don't think it would make much sense to combine that with CPU turnaround time — and if one did want to compute that, we don't say why the threads are not runnable initially — maybe they're just waiting for a certain amount of time to pass or for a user to press enter to advance to the next prompt or similar rather than for something like disk I/O.)

Question 5 (1 points): (see above) What is the total turnaround time experienced by thread C between time 0 and 11? (Write your answer as a base-10 number like "15".)

Answer: (accepted answers: 7)

Question 6 (1 points): Without preemption, shortest job first does not minimize mean turnaround time. In fact, in some corner cases, it fails to even achieve the minimum mean turnaround time possible without preemption (and on a single core). One reason for this is that the shortest job first does not consider schedules that leave the processor idle in order to wait for a short job. Which of the following are examples of such scenarios?
Select all that apply

(correct answer)
thread A becomes runnable at time 0, requires 5 units of time; thread B at time 1 requiring 2 units of time
(accepted any answer)
thread A becomes runnable at time 0, requires 5 units of time; thread B at time 2 requiring 2 units of time
thread A becomes runnable at time 0, requires 5 units of time; thread B at time 1 requiring 4 units of time
thread A becomes runnable at time 0, requires 10 units of time; thread B at time 5 requiring 9 unit of time

The question was intended to ask if the schedule is a scenario where not running A in order to be able to run the later-arriving short job B immediately after it arrives would decrease mean turnaround time over the shortest-job-first scheduler which avoids leaving the CPU idle. In retrospect I should have been more explicit about the antecedent of "such scenarios", so I've also given credit for cases where a schedule with preemption would do better (assuming context switch times are not too significant) than the SJF schedule without preemption.

For A@0 5 units, B@ 1 2 units: mean turnaround time running A at time 0, B at time 5 is (5 [A]+6 [B])/2 versus running B at 1, A at 3 is (8 [A] + 2 [B])/2, which is smaller. Making a similar comparison for the other cases does not yield a lower time.

A common confusion on this question was the assumption that the shortest job first schedule would look into the future and delay running something if a shorter job was about to come along. For example, assuming in the first scenario above that B would run first because it's shortest, even though this would require not running A initially. This is not quite how shortest job first (or shortest remaining time first) works — it choses the shortest among those currently ready because otherwise (as mentioned in the question) this would involve leaving the processor idle when that was not necessary.

Suppose a system uses a (preemptive) multi-level feedback queue scheduler to decide when threads get access to a single core. This scheduler divides threads into five priorities:

priority 4 (highest), with a time slice of 1 millisecond
priority 3, with a time slice of 2 milliseconds
priority 2, with a time slice of 5 milliseconds
priority 1, with a time slice of 10 milliseconds
priority 0, with a time slice of 20 milliseconds

And the scheduler increases the priority of threads (if possible) whenever they are not preempted and do not use their entire time slice and decreases the priority of a thread whenever it uses its entire time slice and still requires more computation.

This system is running four threads, each of which have the following CPU usage patterns which are repeated indefinitely:

thread A attempts to perform computation indefinitely
thread B1 attempts to perform 1.5 milliseconds of computation to process mouse input every 20 to 30 milliseconds
thread B2 attempts to perform 1.3 milliseconds of computation to process mouse input every 20 to 30 milliseconds
thread C attempts to perform 8.5 milliseconds of computation, followed by waiting about 50 milliseconds to get data from the network

For the following questions, assume context switch and other scheduling overheads are neglibile and all threads have been running for a while.

Question 1 (1 points): (see above) Thread B1 will have priority

Question 2 (1 points): (see above) When thread C completes a disk I/O, it is possible that it will need to wait for other threads to perform computations before it will be allowed to start performing its 8.5 milliseconds of computation.

Which of the following is the maximum amount of time that it might need to wait to start its computation? If the true answer is on a boundary (e.g. 1.1ms) then we will accept either of the two answers that touch the boundary as correct (e.g. both "less than 1.1 ms" and "between 1.1 ms and 1.8 ms" for 1.1ms).

B1 and B2 will generally have higher priority than C. If both of them receive some mouse input just before C's disk I/O finishes, then they'll delay C by 1.5 + 1.3 ms = 2.8 ms.

Question 3 (1 points): An alternative to the multi-level feedback queue style of designs for approximating SRTF is to try to record the length of CPU bursts, and use the average of recent CPU burst times to make a scheduling decisions.

In order to perform the task of a recording the length of CPU bursts, an operating system can time how long a process runs each time it is scheduled. This will result in a sequence of timings, one for each time the thread is scheduled on a core. The length of each CPU burst (suitable for approximating SRTF) is equal to _____

the length of exactly one of the timings
the sum of these timings, starting from some point when the thread was preempted and going until it next was preempted
the sum of these timings, starting from some point when the thread was preempted and going until it next ceased to be runnable
(correct answer)
the sum of these timings, starting from some point when the thread became runnable and going until it next ceased to be runnable
the sum of these timings, starting from some point when the thread become runnable and going until it is next preempted

Question 4 (1 points): A scheduler that approximates SRTF, like a multi-level feedback queue design, will tend to favor interactive programs that perform short CPU bursts. Linux's Completely Fair Scheduler sometimes does so, but since it has a different scheduling goal, it often wil not. In which of the following scenarios is CFS likely to result in much worse performance for the interactive thread than an MLFQ-like scheduler that approximates SRTF?

(Assume every thread has the same weight in CFS. Ignore CFS features we did not discuss in class, like handling multiple cores.)
Select all that apply

running one interactive thread with short CPU bursts that, if running alone, would use very little CPU time and one very CPU-intensive thread that never does I/O
running one interactive thread with short CPU bursts that, if running alone, would use very little CPU time and one non-interactive thread with much longer CPU bursts that performs disk I/O frequently
(correct answer)
running one interactive thread with frequent short CPU bursts that, if running alone, would use most of the available CPU time, and one very CPU-intensive thread that never does I/O
(correct answer)
running one interactive thread with short CPU bursts and a very large number of CPU-intensive threads that never do I/O

Question 5 (1 points): Consider the following C code that uses the POSIX threading API:

pthread_t a, b;
struct point { int x, y; };
void *bar(void *arg) {
    struct point *p = (struct point *) arg;
    printf("%d\n", p->x);
}
void *foo(void *arg) {
    struct point *initial = (struct point *) arg;
    struct point new;
    new.x = initial->x; new.y = initial->y;
    pthread_create(&b, NULL, bar, (void*) &new);
    pthread_join(b, NULL);
}
int main(void) {
    struct point *main_pt = malloc(sizeof(struct point));
    main_pt->x = 100; main_pt->y = 120;
    pthread_create(&a, NULL, foo, main_pt);
    pthread_join(a, NULL);
    free(main_pt);
}

Ignore any minor syntax errors above, assume all appropriate headers are included and that the pthread functions do not fail.

When this code is run p->x in bar most likely will access _____?

a value stored in the heap
(correct answer)
a value stored on thread a's stack
a value stored on thread b's stack
a value stored in a global variable
a value stored in a register in thread b
a value stored in a register in thread a

Consider a multicore system where each core has its own cache. Consistency between these caches is maintained using a shared bus and a cache coherency protocol where each cache block is either in a modified, shared, or invalid state like we discussed in lecture. The implementation of this protocol tries to maximize efficiency by:

keeping values cached whenever possible without risking inconsistency; and
not writing values to the memory bus when possible (for example, by only invalidating values rather than writing a new value to memory)

On this system, cache blocks are 256 bytes, so addresses 0x00 through 0xFF are in the one cache block, as are 0x100 through 0x1FF, 0x200 through 0x2FF, etc.

Suppose all cores initially have empty caches and perform single-byte reads and writes as follows, in this order:

core 1 reads from 0x4000
core 2 reads from 0x4020
core 1 writes to 0x5000
core 1 writes to 0x4000
core 2 reads from 0x4028
core 1 reads from 0x5008

Assume at no point are cache blocks evicted from the cache other than due to invalidations to ensure consistency with other cores.

Question 1 (1 points): (see above) What is the final state of the cache block containing 0x4020 on core 1?

modified
(correct answer)
shared
invalid

Question 2 (1 points): (see above) What is the final state of the cache block containing 0x5000 on core 1?

(correct answer)
modified
shared
invalid

Question 3 (1 points): Suppose two threads A and B coordinate using a spinlock implemented based on atomic exchange like we dicussed in lecture. Assume, unlike xv6's spinlocks, these threads do not disable interrupts when acquiring the spinlock.

Both threads are running on a single-core system that uses a round-robin scheduler with a 20 millisecond time quantum. Thread A acquires the spinlock, then needs to do about 50 milliseconds of computation, then will release the spinlock and terminate. About 1 millisecond after thread A acquires the spinlock, thread B tries and waits to acquire the spinlock. Assuming thread A and B are the only runnable threads on the system (and never need to wait for I/O), how long will it be between when thread A acquires the spinlock and when it releases it?

about 50 milliseconds
about 70 milliseconds
(half-credit)
about 90 milliseconds
(half-credit)
about 100 milliseconds
(correct answer)
about 110 milliseconds
about 130 milliseconds
about 150 milliseonds
A will never release the spinlock

A runs for 1 ms; B runs for 20 ms (trying and failing to acquire the spinlock; total 21 ms); A runs for 20 ms (total A 21 ms; total 41 ms); B runs for 20 ms (trying and failing to acquire spinlock; total 61 ms); A runs for 20 ms (total A 41 ms; total 81 ms); B runs for 20 ms (total A 41 ms; total 101 ms); A runs for 9 ms (total A 50 ms; total 110 ms)

Question 4 (1 points): In lecture, we discussed implementing a mutex that does not consume a lot of CPU running a waiting loop by using a queue of waiting threads and a spinlock. This mutex implementation may _____.
Select all that apply

(correct answer)
make a previously runnable thread not runnable from the mutex lock operation
return from the mutex lock operation after locking and not unlocking the spinlock
(correct answer)
need to wait briefly to acquire the spinlock (by running a loop that repeatedly checks if the spinlock is free) in order to complete the mutex unlock operation

Question 5 (1 points): Consider the following incomplete C code that uses the POSIX threading API:

pthread_mutex_t lock;
pthread_cond_t cv; // condition variable
int values[2];
int have_values[2] = {0, 0};

void SetValue(int index, int new_value) {
    pthread_mutex_lock(&lock);
    values[index] = new_value;
    have_values[index] = 1;

    if (___________________________________) {
        pthread_cond_broadcast(&cv);
    }
    pthread_mutex_unlock(&lock);
}

void WaitForAndGetBothValues(int *values_copy) {
    pthread_mutex_lock(&lock);
    while (!have_values[0] || !have_values[1]) {
        pthread_cond_wait(&cv, &lock);
    }
    values_copy[0] = values[0];
    values_copy[1] = values[1];
    pthread_mutex_unlock(&lock);
}

Ignore any minor syntax errors above and assume all condition variables, locks, etc. are appropriately initialized before any of the above code is run.

The WaitForAndGetBothValues function is intended to wait for SetValue to be called with both index 0 and index 1, then return the resulting contents of the values array (by copying it into an array located using the supplied pointer). Which of the following would be best to place in the blank above to make WaitForAndGetBothValues behave correctly?

Question 1 (1 points): Consider the following code that uses semaphores:

sem_t a, b;
int x = 0;

void foo() {
    sem_wait(&a);
    write(STDOUT_FILENO, "A");
    x = x + 1;
    sem_post(&a);
    write(STDOUT_FILENO, "B");
    sem_wait(&b);
    write(STDOUT_FILENO, "C");
}

void bar() {
    sem_wait(&a);
    while (x > 1) {
        write(STDOUT_FILENO, "D");
        sem_post(&b);
        x = x - 1;
        write(STDOUT_FILENO, "E");
    }
    sem_post(&a);
}

Suppose the a semaphore is initialized to 1 and the b are semaphores is initialized to 0 by code not shown, and no code other than the initialization code and the code above uses these semaphores or changes the value x. Which of the following are possible outputs from multiple threads calling foo() and bar() multiple times?
Select all that apply

As some pointed out on Piazza, I meant to write while (x > 0) rather than while (x > 1) above. We've accepted both the answers with this correction and without. Sorry for the confusion.

Suppose we have a multithreaded system for handling a ticket vending service. When dealing with the last few remaining tickets, this service attempts to offer each ticket to a user and wait until they complete the purchase of the ticket or decline to purchase it. While this is happening, another user can be waiting for a ticket to become available or for all tickets to be sold out.

Suppose, to implement this, the service provides the following functions:

bool WaitForNextTicket() --- waits for a ticket to be available for purchase or for all tickets to be sold out. Returns true if a ticket is still available for purchase, in which case it is reserved until BuyTicket() or DeclineTicket() is called. Returns false is tickets are sold out.
void BuyTicket() --- purchases a ticket
void DeclineTicket() --- declines to purchase a ticket

so the code that prompts a user for their decision looks something like:

if (WaitForNextTicket()) {
    printf("Purchase ticket? ");
    if (ReadInputFromUser() == YES) {
        BuyTicket();
    } else {
        DeclineTicket();
    } 
} else {
    printf("Tickets are sold out\n");
}

Question 2 (1 points): (see above) Suppose these functions are implemented using mutexes and condition variables. The WaitForNextTicket() function would most likely call pthread_cond_wait ____________.

in a loop for as long as there are unsold tickets
(correct answer)
in a loop for as long as all unsold tickets are awaiting a user's decision to purchase or decline it
in a loop for as long as at least one ticket is awaiting a user's decision to purchase or decline it
once (without a loop), but only if all unsold tickets are awaiting a user's decision to purchase or decline it
once (without a loop), but only if at least one ticket is awaiting a user's decision to purchase or decline it
once (without a loop), but only if there are unsold tickets

Question 3 (1 points): (see above) Suppose these functions are implemented using POSIX semaphores. Of the choices listed below, it is most likely that the WaitForNextTicket() function would call sem_wait on a semaphore whose value represents ______.

(correct answer)
the total number of tickets available for purchase and not reserved
the total number of tickets either available for purchase or currently reserved for purchase (but not bought or declined)
the total number of tickets currently reserved for purchase (but not bought or declined)
the total number of tickets ever reserved for purchased, regardless of whether eventually bought or declined
the total number of tickets already bought

Consider a multithreaded system for processing college applications that stores most of its data in memory. It is written in C++ and uses the following structs:

struct Student {
    pthread_mutex_t lock;
    string name;
    Application* current_application;
    ...
};

struct Application {
    pthread_mutex_t lock;
    Student *student;
    AcademicYear *year;
    string status; /* one of pending, accepted, admitted, rejected, withdrawn */
    ...
};

struct AcademicYear {
    pthread_mutex_t lock;
    set<Application*> pending, admitted, rejected, withdrawn;
};

and has the following functions relevant to the questions below:

bool WithdrawPendingStudent(Student *student) {
    pthread_mutex_lock(&student->lock);
    Application *app = student->current_application;
    pthread_mutex_lock(&app->lock);
    bool result;
    if (app->status == "pending") {
        pthread_mutex_lock(&app->year);
        app->status = "withdraw";
        app->year->pending.erase(app);
        app->year->withdrawn.erase(app);
        pthread_mutex_unlock(&app->year);
        result = true;
    } else {
        result = false;
    }
    pthread_mutex_unlock(&app->lock);
    pthread_mutex_unlock(&student->lock);
    return result;
}

bool AdmitFirstKPending(AcademicYear *year, int K) {
    pthread_mutex_lock(&year->lock);
    for (int i = 0; i < K; ++i) {
        Application *app = FindHighestRanked(&year->pending);
        year->admitted.insert(app);
        year->pending.erase(app);
        pthread_mutex_lock(&app->lock);
        app->status = "admitted";
        pthread_mutex_unlock(&app->lock);
    }
    pthread_mutex_unlock(&year->lock);
}

(Please ignore minor syntax errors, etc. in the above, assume all appropriate headers are included, etc.)

Question 4 (1 points): (see above) Two simultaneous calls to WithdrawPendingStudent() and AdmitFirstKPending() sometimes deadlock. When this occur, the thread calling WithdrawPendingStudent() can be holding the lock in a _____ while the thread calling AdmitFirstKPending() is trying to acquire the same lock.
Select all that apply

AcademicYear struct
(correct answer)
Application struct
Student struct

Question 5 (1 points): (see above) Two simulatenous calls to WithdrawPendingStudent() and AdmitFirstKPending() sometimes deadlock. When this occur, the thread calling AdmitFirstKPending() can be holding the lock in a _____ while the thread calling WithdrawPendingStudent() is trying to acquire the same lock.

(correct answer)
AcademicYear struct
Application struct
Student struct

Suppose an x86 system which uses the same page table layout as xv6 (two-level page tables, 4KB pages, 32-bit virtal addresses, 4B page table entries) where:

the page table base pointer value is 0xAC000 (a byte address ,not a page number).
the top-level page table ("page directory") entry corresponding to virtual address 0x4443 contains page number 0x123 and is marked as present
the second-level page table entry corresponding to virtual address 0x4443 contains page number 0x456 and is marked as present

Question 1 (1 points): (see above) The first-level page table ("page directory") entry for virtual address 0x4443 is at what physical address? Write your answer as a hexadecimal number like 0xAC888; if not enough information is provided write unknown.

Answer: (accepted answers: /(?:0[xX])?0*[aA][cC]000/)

Question 2 (1 points): (see above) The second-level page table entry for virtual address 0x4443 is at what physical address? Write your answer as a hexadecimal number like 0xAC888; if not enough information is provided write unknown.

Answer: (accepted answers: /(?:0[xX])?0*123010/)

Question 3 (1 points): (see above) The byte stored at virtual address 0x4443 is located at what physical address? Write your answer as a hexadecimal number like 0xAC888; if not enough information is provided write unknown.

Answer: (accepted answers: /(?:0[xX])?0*456443/)

Question 4 (1 points): Suppose wanted to make a system call in xv6 with the following prototype:

int cross_process_copy(char *source_address, int dest_pid, char *dest_address)

which would copy one byte from virtual address source_address in the current process to virtual address dest_address in the destination process dest_pid. After retreiving the arguments for this system call, one potential implementation strategy would look like the following:

char *real_source_address = convert_source_address(source_address);
char *real_dest_address = convert_dest_address(dest_pid, dest_address);

*real_dest_address = *real_source_address;

To make this work, the convert_dest_address function should:

return dest_address without modification
return the sum of the virtual address of dest_pid's page table and dest_address
return the sum of the physical address of dest_pid's page table and dest_address
return the sum of dest_address and the start of the kernel virtual memory region (KERNBASE) (the operation performed by P2V(dest_address))
use dest_pid's page table to convert dest_address to a physical address, then return that physical address
use the current page table to convert dest_address to a physical address, then return that physical address
(correct answer)
use dest_pid's page table to convert dest_address to a physical address, then return the sum of that phyiscal address and the start of the kernel virtual memory region (KERNBASE) (the operation performed by P2V(that physical address))
use the current page table to convert dest_address to a physical address, then return the sum of that phyiscal address and the start of the kernel virtual memory region (KERNBASE) (the operation performed by P2V(that physical address))

Question 5 (1 points): Suppose an operating system does not allocate memory for programs or load a program's data from disk until it attempts to use each page of it.

Assume the operating system is using a page table layout like xv6 (two-level page tables, 4KB pages, 32-bit virtal addresses, 4B page table entries).

Suppose a program accesses all of an 8192-byte global array for the first time. When this occurs, page faults may occur to allocate memory for the array.

How many page faults are required may depend on what addresses were allocated to the array (for example, whether the array starts in the middle of a page) and whether memory around the array was already accessed previously.

What numbers of page faults are possible?

Consider the following C code:

int fd1 = open("one.data", O_RDONLY);
if (fd1 < 0) handle_error();
char *p = mmap(NULL, 8192, PROT_READ, MAP_SHARED, fd1, 0);
if (p == (char*) MAP_FAILED) handle_error();
int fd2 = open("one.data", O_RDWR);
if (fd2 < 0) handle_error();
char *q = mmap(NULL, 8192, PROT_READ | PROT_WRITE, MAP_SHARED, fd2, 0);
if (q == (char*) MAP_FAILED) handle_error();
*q = 'a';
printf("%c", *p);

Assume all required headers are included and ignore minor syntax errors above and assume open() and mmap() do not fail (so handle_error() is not reached).

Assume:

the pointers p and q contain different virtual addresses, and
that no data is swapped out to disk at any time while the code runs

Question 1 (1 points): (see above) Immediately after the code above runs, which of the following statements are true?
Select all that apply

(correct answer)
*p and *q contain the same value (regardless of the initial contents of one.data)
(correct answer)
the physical pages assigned to *p and *q are the same
(correct answer)
the virtual addresses assigned to *p and *q have the same page offset
(correct answer)
opening one.data and writing to it can change the value of *p

part about virtual address page offset was briefly miskeyed

Question 2 (1 points): (see above) Immediately after the code above runs, which of the following operations are likely to trigger a page or protection fault for a virtual address assigned to the file one.data?
Select all that apply

Question 3 (1 points): Suppose three processes A, B, and C are running. Processes A and B share the same executable foo.exe, and the code from the executable mapped into their address space. Process C is running with a different executable and tries to use a lot of space on its heap. In order to satisfy process C's requirement for more memory, the operating system chooses to use space currently used for part of the machine code in foo.exe. This space was loaded into memory since it was first used by process A. However, process B most recently accessed the machine code in question.

As part of allocating the data for C, the operating system should ______.
Select all that apply

(correct answer)
edit some of the page table entries for process A
(correct answer)
edit some of the page table entries for process B
(correct answer)
edit some of the page table entries for process C
write back changes to this space to foo.exe

Suppose a system implements virtual memory with 5 physical pages used to satisfy a program's memory needs. To keep things simple, the physical pages are only used for the program's memory and not to cache file data, etc.

It is running two processes A and B, and each of those processes is assigned three virtual pages: for process A, A1, A2 and A3; and for process B, B1, B2, and B3.

Process A executes the following access pattern:

A1, A2, A3, A1, A3, A2

and process B executes the following access pattern:

B1, B3, B1, B2, B3, B3

Question 1 (1 points): (see above) Suppose the two processes are run so their accesses are interleaved, so the overall access pattern is:

A1, B1, A2, B3, A3, B1, A1, B2, A3, B3, A2, B3

Assuming all physical pages are initially empty, with an LRU replacement policy how many page replacements will occur? (Loading data into an physical page, regarldess of whether it was empty or not, is a page replacement.) Write your answer as a base-10 number.

Answer: (accepted answers: 7)

A1(replace empty; A1), B1(replace empty; A1,B1), A2(replace empty; A1,B1,A2), B3(replace empty; A1,B1,A2,B3), A3(replace empty; A1,B1,A2,B3,A3), B1(hit; A1,A2,B3,A3,B1), A1(hit; A2,B3,A3,B1,A1), B2(replace A2; B3,A3,B1,A1,B2), A3(hit; B3,B1,A1,B2,A3), B3(hit; B1,A1,B2,A3,B3), A2(replace B1; A1,B2,A3,B3,A2), B3(hit)

Question 2 (1 points): (see above) Suppose the two processes are run one after the other, so the overall access pattern is:

A1, A2, A3, A1, A3, A2, B1, B3, B1, B2, B3, B3

Assuming all physical pages are initially empty, with an LRU replacement policy how many page replacements will occur? (Loading data into an physical page, regarldess of whether it was empty or not, is a page replacement.) Write your answer as a base-10 number.

Answer: (accepted answers: 6)

(load A1, A2, A3; load B1, B3, B2(replacing A1))

Question 3 (1 points): In lecture we discussed an approximation of LRU called SEQ that:

seperated physical pages into an ordered active list and an ordered inactive list
placed just-replaced physical pages on the top of the active list
pages on the active list are not monitored for accesses
when the inactive list gets too small, takes a page from the bottom of the active list and places on the inactive list
when a page on the inactive list is accessed, places that page on the top of the active list
selects physical pages to replace from the bottom of the inactive list

Which of the following is true about this scheme?
Select all that apply

decreasing the size of the inactive list is likely to increase the overhead of detecting accesses to pages
decreasing the size of the inactive list from many pages to one page is likely to make the scheme approximate LRU better
(correct answer)
increasing the size of the inactive list from about one tenth of all physical pages to all but a couple of physical pages is likely to make the scheme approximate LRU better (possibly at the cost of a higher overhead)

Question 4 (1 points): In some circumstances, readahead can dramatically increase the number of page cache misses (and therefore required amount of (re)loading data from disk). Suppose this is true even though the pages which are read ahead are, in fact, used before they are replaced.

Which of the follow are true about this scenario where readahead causes bad performance?
Select all that apply

if the operating system's readahead implementation loaded more pages at a time, this would likely decrease the number of page cache misses
(correct answer)
if the operating system's readahead implementation loaded fewer pages at a time, this would likely decrease the number of page cache misses
(correct answer)
a significant number of the virtual pages replaced as part of the readahead were probably accessed before the readahead pages

Question 5 (1 points): Suppose a process takes input from the keyboard on a xv6-like operating system. As part of obtaining input from the keyboard controller, the operating system needs to use a control register on the device controller to determine whether there is input. In which of the following scenarios is code that accesses the keyboard controller's control registers likely to run?
Select all that apply

(correct answer)
from an device interrupt handler that interrupts the process that requires keyboard input
(correct answer)
from an device interrupt handler that interrupts a process that does not require keyboard input
from the read system call implementation (and not during a interrupt triggered by the device, regardless of whether that can interrupt a system call)

Question 1 (1 points): Suppose a network interface device uses direct memory access (DMA). It provides the following registers, each of which has corresponding memory address through which the operating system can set them:

a control register which is 1 if receiving input is enabled, 0 otherwise
a register indicating the current size of the input buffer
a register indicating the address of the input buffer
a register indicating the amount of the input buffer which is currently used for data
and some additional registers not listed

(Assume no IOMMU is in use. If you do not know what an IOMMU is, then it won't affect your answer.)

In order receive input from the network using this device, it would be best for the operating system to set the register indicating the address of the input buffer to:

a kernel virtual address corresponding to the operating system's buffer for network data
(correct answer)
a physical address corresponding to the operating system's buffer for network data
a user virtual address corresponding to the current read() call trying to read network input
the address of the exception handler that handles input from this device

For these questions, consider a FAT-like filesystem with:

1024 byte clusters;
4 byte FAT entries; and
32 byte directory entries

Question 2 (1 points): (see above) Ignoring space in the FAT itself and the header (BPB), how many clusters would be used (completely or partially) to store a FAT filesystem whose root directory contains 100 directories and each of those directories contains one 1000-byte text file? Write your answer as a base-10 number of clusters (like "500")

Answer: (accepted answers: 204)

3200 bytes for directory entries -> 4 clusters for root directory + 100 * (1 cluster for each direcotry + 1 cluster for each contained text file)

Question 3 (1 points): (see above) In the scenario described in the previous question, how many FAT entries would contain a value representing a cluster number? Write your answer as a base-10 number of clusters (like "500").

Answer: (accepted answers: 3)

so no end-of-file/cluster-chain entries or free markers; leaving only the ones in the root directory, and not the last one

Question 4 (1 points): Consider the following strategies for assigning clusters to a large file on a FAT filesystem. Assume the operating system assigning clusters knows the number of clusters that must be assigned in advance and that this operating system always keeps a copy of the file allocation table in memory for performance.

Which of the following strategies would likely result in best performance when reading the file from a hard drive?

choose the available clusters with the lowest numbers (closest on the disk to the location of the FAT)
choose the available clusters with the highest numbers (furthest on the disk to the location of the FAT)
(correct answer)
choose a set of available clusters that are contiguously numbered
choose a set of clusters that use as many cylinders as possible
choose a set of available clusters that are as close in number to the cluster containing the directroy entry for the file, even if this means having large gaps between the chosen cluster numbers

Consider an xv6-like filesystem with:

32 byte directory entries
1024-byte blocks
4-byte block pointers
10 direct block pointers in each inode
1 indirect block pointer in each inode
1 double-indirect block pointer in each inode

Question 5 (1 points): (see above) To store a directory with two 99.5KB files (where KB = 1024 bytes), how many blocks are used (outside of inodes, free block maps, etc.)? Include any blocks allocated to hold block pointers (outside of inodes) and space required for the directory.

Write your answer as a base-10 number (like "500").

Answer: (accepted answers: 203)

100 blocks for each file, plus 1 indirect block for each file, plus 1 block for directory entries

Suppose a inode-based filesystem uses redo-logging to recover from failures.

As part of truncating a file from a large size to a smaller size it needs to write:

A: the inode for the file
B: a log entry corresponding to the inode for the file to specify a new file size
C: the free block map to mark as free blocks that previously held parts of the file's data
D: log entries corresponding to the free block map that previously held parts of the file data
E: the free block map to mark as free blocks that previously held indirect blocks for the file
F: log entries corresponding to the free block map that previously held parts of indirect blocks for the file
G: a log entry indicating that a transaction is committed

The filesystem performs this truncation operation atomically, meaning that on a failure, after reboot, either the file will be truncated or will have its original size.

Question 1 (1 points): (see above) What is a valid order for the filesystem to perform the above operations during normal operation?

Write your answer as a sequence of letters. For example, if you think it would be valid for each of the operations above occur exactly once and in the order written above, write "ABCDEFG".

Answer: (accepted answers: /[BDF]{3}G[ACE]{3,}/)

Question 2 (1 points): We discussed the idea of "block groups" (like implemented in the Berkeley Fast File System) as a mechanism for keeping related data and metadata physically close on a hard drive.

In the Berkeley FFS scheme, each directory and the files it contains are assigned to a block group. This makes some assumptions about the typical access patterns for files.

For which of the following access patterns are many long seeks likely to be required (assuming data is not cached) when using the block group scheme we described in lecture?
Select all that apply

(correct answer)
reading and writing files in many subdirectories, where each subdirectory contains exactly one file, of a single directory
loading an application from /usr/bin along with many small libraries which are all contained in /usr/lib.
reading C source files and writing object files in a directory checked out from a version control system as part of compiling a program

Consider an inode-based filesystem that supports fragments for small files.

In this filesystem, each inode contains (among other fields): * a boolean which is true if the first direct block represent an extent and not a normal block * 12 direct block pointers * 1 indirect block pointer * 1 double-indirect block pointer * 1 triple-indirect block pointer

The filesystem uses 4KB blocks with 1KB fragments and directory entries that take 64 bytes.

Assume that the filesystem only uses fragments for files that fit within a single fragment.

For this question, 1KB = 1024 bytes and 1MB = 1024KB.

Question 3 (1 points): (see above) Suppose the filesystem is storing a directory with one thousand files where each of those files is 768 bytes. How many blocks will the filesystem use outside of inodes, free block maps, superblocks, etc.? Include space required for file data, directory entries and blocks of block pointers (outside of inodes). Write your answer as a base-10 number. (Assume the filesystem uses fragments in the most efficient way possible.)

Answer: (accepted answers: /26[67]/)

1000 fragments for files = 250 blocks, plus at least 64000 bytes (= 16 blocks) for directory) plus 1 indirect pointer for directory; also gave credit for 266 because of the better choice of the directory using an extent

Question 4 (1 points): (see above) Suppose the filesystem is storing a directory with 10 subdirectories, where each subdirectory contains a 10MB file. How many blocks will the filesystem use outside of inodes, free block maps, superblocks, etc.? Include space required for file data, directory entries and blocks of block pointers (outside of inodes). Write your answer as a base-10 number. (Assume the filesystem uses fragments in the most efficient way possible.)

Answer: (accepted answers: /.*/)

Giving everyone credit because you definitely need to know the size of block pointers. The intended question: If 4 byte block pointers: If 10MB file = 2560 data blocks + 1 indirect block pointed to by inode (for blocks 12-1024+12) + 1 double-indirect block pointed to by inode (for blocks >1024+12) + 2 indirect block pointed to by double-indirect block = 2564; main directory and each subdirectory use one fragment, so 11 fragments in total, which fits in 3 blocks, so 2564 * 10 + 3 = ... [originally counted number of indirect blocks used in double-indirect incorrectly]; Also I see in retrospect that I allowed for the possibility of extents, which is a better possible answer.

Question 5 (1 points): Suppose one uses the client/server communication model we discussed in lecture in system with two clients A and Band one server S.

Suppose we want implement a way to send a message from a client A to another client B. To be most consistent with the client/server model, this would be implemented by ______.

(correct answer)
having client A connect to the server and send its message and then having client B connect to the server and ask for the message and have the server reply with that message
having client A become the server and then having client B connect to the client A to receive the message
having client B become the server and then having client A connect to the client B to receive the message
having client A connect to the server and send its message, then have the server contact client B and forward the message

Suppose we wanted to allow someone to read files remotely using RPC. To do this, we take the simple solution of exposing the following POSIX file I/O calls as RPC calls:

int open(const char *path, int flags)
int read(int fd, char* buffer, size_t size)
int close(int fd)

Question 1 (1 points): (see above) To convert the read() call above into an RPC call, the resuling RPC call would most likely
Select all that apply

send the address contained in the pointer buffer to the server when the call is made (before invoking the server's implementation of read)
send the contents of buffer from the client to the server when the call is made (before invoking the server's implementation of read)
(correct answer)
send the contents of buffer from the server to the client when the call is made (after invoking the server's implementation of read)

Question 2 (1 points): (see above) Which of the following statements about how this scheme deals with failures are true?
Select all that apply

if the server can't open a file, then it will appear to the client like the file exists but is empty instead
(correct answer)
if the server crashes without saving information about file descriptors, then a client that is in the middle of reading a file will not be able to continue when the server is restarted
(correct answer)
if a client crashes without saving information about file descriptors, then the server is unlikely to have enough information to free resources for file descriptors
if a client crashes while reading a file and then tries to open and read the file again after rebooting, it may sometimes read the wrong file contents (during the second read, after rebooting)

Question 3 (1 points): Suppose an RPC system attempts to deal with errors by making sure that every message sending an RPC call or the return value of an RPC call receives an acknowledgment. If it does not, the RPC system retries sending that message until either it receives that acknowledgment or a certain amount of time has passed (in which case it signals an error to the calling program, if possible). In this scheme, if a client does not receive the return value of an RPC call after a certain amount of time, it also signals an error to the calling program.

Suppose network failures involving messages being lost or reoredered can occur and machine failures can occur according to fail-stop model. Which of the following are possible?
Select all that apply

(correct answer)
sending from a client an RPC that changes something on the server but having it appear to fail on the client
sending from a client an RPC that fails to reach the server but having it appear to work on the client
(correct answer)
having an RPC appear to be processed correctly on the server but fail on the client
(correct answer)
having an RPC appear from the server not to return the result to the client but actually return the result on the client

Question 4 (1 points): Consider a system using two-phase commit that distributes a database across two worker machines A and B and uses a single coordinator machine C.

Suppose after preparing a transaction with machines A and B, machine C receives a vote to commit from machine A and machine B. Then, before machine C can act on that result, machine A goes down for a long time.

When machine A comes back up, what would be most appropriate for it to do first?

make the changes that are part of the transaction that it previously voted to commit, then tell machine C that it did so
(correct answer)
resend its to vote to commit to machine C and wait for a reply
send a message to machine B saying that the transaction is committed, then commit the transaction
send a vote to abort to machine C

Question 1 (1 points): Suppose we want to provide the ability to run commands on a remote server using a stateless server. With support from servers, which of the following features could we provide that would be consistent with the stateless design?
Select all that apply

(correct answer)
the ability to run two commands at a time
a "change directory" operation that affects which directory the next command from a particular user runs in, regardless of what client machine they connect from
(correct answer)
a "change directory" operation that affects which directory the next command from a particular user runs in, but only if they use the same client machine to run that next command

Suppose an RPC call takes 1000 microseconds to complete plus an additional microsecond for every kilobyte of data transferred in the RPC call plus the time requires to run the corresponding RPC function on the server.

Question 2 (1 points): (see above) Suppose this RPC system is used by NFSv2. A client application to open a 2048KB file and read the last 1024KB of it 128KB at time and then closes it. Assume the client does not perform any caching of file data. The client already has the file ID for the directory, but is missing one for the file itself. Ignoring the time required to run the RPC functions on the server that lookup file data (and so just including the RPC overhead), about how much time is required for this operation in microseconds?

(Assume the client does not need to check the attributes of the file as part of opening it.)

Answer: (accepted answers: 10024)

1 LOOKUP of 1000 microseconds + 8 READs of 1000 + 128 microseconds each (was initially miskeyed because I misdivided to get 10 reads; and then miskeyed because I multiplied wrong)

Question 3 (1 points): When programs run with special priviliges on behalf of other users (like set-user-ID programs), a major concern is time-to-check-to-time-of-use vulnerabilities. One common example of these is when a program checks if the user on whose behalf if it is running is allowed to access some file specified via a path, and then, later, it actually access the file, without verifying that the path referred to the same resource that was checked earlier.

What are some examples of operating system support that could be added to POSIX-like systems that would be helpful for avoiding these problems? (In the options, file mode bits refers to the very limited access control list POSIX supports via chmod.)

(correct answer)
a new version of the open() system call that takes a user ID to check against access control lists and file mode bits (and uses that user ID in addition to the user ID the current process is running as)
(correct answer)
tracking different user IDs for filesystem accesses and other priviliged operations (like accessing devices) and allowing a set-user-ID program to be marked as only changing one of these
a system call for checking, given the path to a file, whether a particular user can access it

Consider a virtual machine implemented using trap-and-emulate without special hardware support for virtualization.

Suppose a program is running in the virtual machine and executes a system call to read input from the keyboard. The guest OS, while handling this system call, switches to another program and runs it in user mode. Eventually, the hypervisor detects keyboard input. As a result, it decides to simulate an interrupt for the keyboard. Then, the guess OS reads a control register from the keyboard controller. Then, the guest OS switches the original program and runs it in user-mode.

Question 4 (1 points): (see above) During the process described above, which of the above operations are likely involve in an exception (of any kind, including interrupts, traps, etc.) on the host machine (that is, an exception handled by the hypervisor)?
Select all that apply

(correct answer)
the program executing a system call
the hypervisor simulating the keyboard interrupt for the guest OS
(correct answer)
the guest OS switching to another program and running it in user-mode
(correct answer)
the guest OS reading a control register from the keyboard ocntroller

Question 5 (1 points): (see above) If the hypervisor is implemented using shadow page tables, which of the above operations are likely involve changing which shadow page table is active (or clearing/reconstructing a large portion of the shadow page table)?
Select all that apply

(correct answer)
the program executing a system call
(correct answer)
the hypervisor simulating the keyboard interrupt for the guest OS
(correct answer)
the guest OS switching to another program and running it in user-mode
the guest OS reading a control register from the [emulated] keyboard controller

originally miskeyed; the guest OS reading a control register doesn't require changing the shadow page table (though it involves running the hypervisor)

All quizzes (key)

Quiz 01

Quiz 02

Quiz 03

Quiz 04

Quiz 05

Quiz 06

Quiz 07

Quiz 08

Quiz 09

Quiz 10

Quiz 11

Quiz 12

Quiz 13