- format of exam
  - generally similar to quizzes, but there will be a few more open-ended questions
    ~ longer questions probably similar in idea to prior final exams/midterms
      ("study materials" tab on course website), but only Spring 2020 had an
      open-book, open-notes format
  - aiming for it being doable in 3 hours
    ~ assuming about 5 minutes per quiz-like multiple-choice or short-answer question
  - 24-hour period to complete the exam
- topics that might be included on the exam
  - please refer to the schedule
- exam grading
  - because there are some open-ended questions, it will take some time
- what to memorize
  - open-book, open-notes
  - but if you did something in an assignment, I think that's reasonable to ask about
- system calls and context switches -- what triggers them
  - system call: a process wants the OS to do something for it, and (typically)
    the process can't do it itself b/c it requires kernel mode for some reason
    > the process deliberately runs an instruction that causes the OS's trap handler to run
    > often this is done as part of a library function
  - context switch: the OS changes which thread is running on a core
    ~ this can be triggered because the current thread doesn't have more work to do right now
      > e.g. it made a system call to read data that isn't available right now
        (and the system is supposed to wait until there is data)
      > e.g. the thread asked to exit
    ~ or can be triggered because the scheduling policy says that some other thread
      is better to run; the OS needs to arrange for its own code to run to detect this
      > e.g. exceeded time quantum -- would probably be detected from a timer interrupt handler
      > e.g. a higher-priority thread is runnable now because its input came in
        -- would probably be detected from the interrupt handler for the device
           where that input came from
- xv6: each process had exactly one thread, so all context switches were both
  thread and process context switches
  if we have more than one thread per process, then we can context switch from
  thread A in process 1 to thread B in process 1,
  or we can context switch from thread A in process 1 to thread C in process 2
  > both are context switches
  > the latter is sometimes called a "process context switch" since we need to
    change some more information
- Quiz 2, question 3 (swtch)

    struct context *first;
    struct context *second;
    int counter = 0;
    int foo(int a) {
        ...
        swtch(&first, ...); /* <---- */
        counter += 2;
        cprintf("foo 1 (%d) ", counter);
        swtch(&first, second);
        counter += 2;
        cprintf("foo 2 (%d) ", counter);
        return a;
    }
    void bar() {
        swtch(&second, first);
        counter += 1;
        cprintf("bar 1 (%d)", counter); /* ******** */
    }

  When "bar 1" is being printed out (assuming foo() has not yet returned),
  where might the value of foo's argument a be located?
  * observation 1: if "bar 1" is being printed out, bar() is currently using
    the active stack frame on the current stack
  * observation 2: foo()'s argument a is kept in either:
    * some registers when foo() is running
      > when foo() stops running, either it's still in those registers or it's
        saved on the stack
    * or on the stack of foo()
  Question: how did bar() start running?
  - looks like foo() can't have called bar()
  - so, bar() must have been run while foo() was still active, via a call to swtch()
  Assuming it's a call to swtch() in foo():
  What did that call to swtch() do?
  - saved the registers of foo() on its stack
  - switched to a different stack
  Therefore: the argument of foo() must be on the stack that was active when
  swtch() was called, and *not* on the stack that is active now, when bar() is running.
  If instead it's a call to swtch() in a trap handler:
  Why would a trap handler call swtch()?
  Example: if there's a timer interrupt and the process/thread running foo()
  has exceeded its time quantum.
  What happened to foo()'s stack+registers when that trap handler was started?
  - the trap handler used the same stack, because foo() executed in the kernel
    [would not be true if foo() was in user mode]
  - running the trap handler saved the registers that were previously active
    in a trapframe
  - and the trapframe was stored on the same stack
  --> either way of running swtch(): the argument values are going to be on foo()'s stack
  three possible ways they got there:
  - saved on the stack when foo() starts
  - was in a register, but saved on the stack when foo() called swtch()
  - was in a register, but saved on the stack when the trap handler started
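  to see swtch()-style "save registers, switch stacks" behavior outside xv6,
  here is a minimal runnable sketch (my example, not xv6 or course code) using
  POSIX ucontext; the names foo_ctx/bar_ctx and the stack size are made up:

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t foo_ctx, bar_ctx;

    static void bar(void) {
        /* now on bar's own stack: foo's argument a is NOT anywhere on this
           stack; it lives on foo's stack (or in the registers saved into
           foo_ctx by swapcontext) */
        printf("bar 1\n");
        swapcontext(&bar_ctx, &foo_ctx);   /* like swtch(&second, first) */
    }

    static void foo(int a) {
        printf("foo starts (a=%d)\n", a);
        swapcontext(&foo_ctx, &bar_ctx);   /* like swtch(&first, second):
                                              saves foo's registers, switches
                                              to bar's stack */
        printf("foo resumes (a=%d)\n", a); /* a restored from foo's stack /
                                              saved registers */
    }

    int main(void) {
        static char bar_stack[64 * 1024];  /* bar gets its own stack */
        getcontext(&bar_ctx);
        bar_ctx.uc_stack.ss_sp = bar_stack;
        bar_ctx.uc_stack.ss_size = sizeof bar_stack;
        bar_ctx.uc_link = NULL;            /* bar never runs to completion here */
        makecontext(&bar_ctx, bar, 0);
        foo(42);
        return 0;
    }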
- barriers
  abstraction that provides a "have everyone sync up with each other" operation

    thread 1          thread 2          thread 3
    do step 1,        do step 1,        do step 1,      } for this to work, parts A-C must
    part A            part B            part C          } be safe to do at the same time
    barrier_wait()    barrier_wait()    barrier_wait()  } threads 1-3 won't finish this
                                                        } "wait" until every thread gets here
    do step 2,        do step 2,        do step 2,      } commonly, step 2 part A requires
    part A            part B            part C          } step 1 part A AND step 1 part B to
                                                        } be done (and similar for other parts)

  example in Life: need cell values from parts A and B of the grid to compute
  the next values for part A
  the barrier prevents running into trouble if one thread is a bit slow:

    thread 1          thread 2
    do step 1,        do step 1, ...
    part A            part B (slow)
    do step 2, ...                } unsafe/unwanted overlap: step 2 part A runs
    part A            part B     } while step 1 part B is still unfinished

- locks
  lock: an abstraction for ensuring that threads take turns doing something
  operations:
    lock()/acquire() -- wait until no one else is taking a turn
    unlock()/release() -- let someone else take a turn
  call the idea of taking turns accessing something "mutual exclusion"
  call code that runs under the taking-turns restriction a "critical section"
  typical pattern: we have a lock for each "resource"
    if resources A and B have separate locks, one thread can access resource A
    while another accesses resource B,
    but no two threads can access resource A at the same time
    or access resource B at the same time
- race conditions
  when we can have inconsistent/wrong behavior b/c of exactly when threads
  access a shared resource
  classic example:
    balance = balance + amount;
  if we took turns, we'd consistently get amount added to balance,
  but when multiple threads do this without taking turns, we can end up with
  inconsistent answers depending on exactly how they get scheduled (when the
  instructions that make up the operation take effect relative to each other)
  another example:
    if (!queue.empty()) {
        item = queue.dequeue()
        ...
    }
  if we don't take turns accessing the queue, then we could call dequeue() when
  the queue is empty, b/c someone else dequeued in between
  if we don't take turns accessing the queue, then if the queue is implemented
  with an array and a counter and doesn't take precautions, two threads could
  dequeue the same item
  "race" --> if one thread is fast enough, something different happens
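  a minimal runnable sketch (my example, not course code) of the classic
  balance race and the lock that fixes it, using POSIX threads; deposit_many
  and the iteration count are arbitrary names/values:

    #include <pthread.h>
    #include <stdio.h>

    static long balance = 0;
    static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *deposit_many(void *arg) {
        for (int i = 0; i < 1000000; i++) {
            pthread_mutex_lock(&balance_lock);   /* wait for our turn */
            balance = balance + 1;               /* the critical section */
            pthread_mutex_unlock(&balance_lock); /* let someone else take a turn */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, deposit_many, NULL);
        pthread_create(&t2, NULL, deposit_many, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* with the lock: always 2000000; without it, the two threads'
           load/add/store instructions can interleave and deposits get lost */
        printf("balance = %ld\n", balance);
        return 0;
    }

  (compile with -pthread; removing the lock/unlock calls usually makes the
  final balance come out short, demonstrating the race)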
- deadlock prevention by avoiding waiting
  simple example of deadlock:

    thread 1:            thread 2:
    Lock(a)              Lock(b)
    Lock(b) <-- wait     Lock(a) <-- wait

  if instead of plain locking we did:

    thread 1:                                   thread 2:
    Lock(a)                                     Lock(b)
    Lock(b), or throw exception and unlock a    Lock(a), or throw exception and unlock b
      if we can't get the lock right away         if we can't get the lock right away

  this avoids deadlock: we don't have infinite waiting
  this doesn't mean that it makes progress
    > would need to do some extra work to make sure that this throwing-exception
      case doesn't happen indefinitely (a problem called "livelock")
    > could hope that eventually the timing will work out, and could do some
      extra work to make it more likely to work out
  this "lock or give up" pattern is the most straightforward way to avoid waiting
  other ways (not always applicable):
  * not having thread 1 and thread 2 and ... share any resources
  * can also avoid "hold and wait": if we only lock 1 thing, we aren't holding
    another resource while we wait (even if we wait for that one thing)
    >> sufficient to prevent deadlock
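  a minimal sketch (my example) of the lock-or-give-up idea using
  pthread_mutex_trylock, the standard non-waiting acquire; lock_both and
  transfer are hypothetical names:

    #include <pthread.h>
    #include <sched.h>
    #include <stdbool.h>

    /* try to take both locks; on failure, release everything and report it
       (the caller can retry, which risks livelock but never deadlock) */
    bool lock_both(pthread_mutex_t *first, pthread_mutex_t *second) {
        pthread_mutex_lock(first);
        if (pthread_mutex_trylock(second) != 0) { /* can't get it right away */
            pthread_mutex_unlock(first);          /* ...so stop holding first */
            return false;                         /* the "throw exception" step */
        }
        return true;   /* holding both locks */
    }

    void transfer(pthread_mutex_t *a, pthread_mutex_t *b) {
        while (!lock_both(a, b))
            sched_yield();  /* back off so success becomes more likely */
        /* ... critical section using both resources ... */
        pthread_mutex_unlock(b);
        pthread_mutex_unlock(a);
    }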
- devices and memory-mapped I/O and interrupts
  - devices have a device controller, typically attached to the memory bus
  - when the processor tries to access memory, it sends a message on the memory
    bus to do the read/write
    - typically, the actual memory responds
    - but device controllers can respond instead, for memory locations chosen for them
  - the device controller hardware can be designed to do whatever it wants to
    respond to these memory reads/writes
  - typically: the device controller exposes "control registers",
    memory locations that
    * give information about the device's state
    * allow the OS to ask the device to do something (e.g. transfer data)
    device controllers can also expose buffers they control,
    e.g. a memory location that contains the last keypress
  - interrupts: the memory bus gives device controllers a way to tell the
    processor "I have something the OS might care about"
    the processor turns this into an interrupt --> runs the OS's trap handler
    typically device controllers use this:
    * to notify the OS of input it needs to handle
    * to notify the OS that some output it started earlier is done
    * ... anything else the OS needs to respond to
    "the OS" --> typically the device driver component of the OS
- device drivers
  - "bottom half": runs in response to interrupts
    > deals with cases where the device controller says "hey, OS, I have
      something for you to deal with"
      example: data for you to send to applications somehow
  - "top half": runs in response to system calls (or similar)
    > deals with application requests to access a device
      example: write something to the screen
    > might communicate with the device
    > might store information in / read information from data structures
      set up/used by the bottom half
      example: kernel buffer of keyboard input, filled in by the "bottom half"
      and copied to the application by the "top half"
- multi-level page tables
  - hardware lookup:
    - a special register (the page table base register) contains the physical
      address of the first-level page table
    - given an N-bit virtual address X, the hardware will:
      - look up index X[bits N-1 to N-K] in the first-level page table to get a
        page table entry
        > K is log_2(number of entries per page table)
      - check the flags of that page table entry
      - extract the physical page number Y from that page table entry
      - look up index X[bits N-K-1 to N-2K] in the second-level page table
        located in physical page Y
      - check the flags of that page table entry
      - extract the physical page number Z from that page table entry
      ...
      - look up index X[bits (number of page-offset bits + K - 1) to (number of
        page-offset bits)] in the last-level page table, located in the
        physical page extracted at the previous level
      - check the flags
      - extract the physical page number A from that page table entry
      - access memory at offset X[bits (number of page-offset bits - 1) to 0]
        of physical page A and give that to the program
  - 32-bit x86 as used by xv6:
    - first lookup uses bits 31 to 22 inclusive (bit 31 is the most significant bit)
    - second lookup uses bits 21 to 12 inclusive and gets the final physical page number
    - page offset is bits 11 to 0 inclusive
  - which addresses are physical and which are virtual?
    - all of them are physical, except for the input virtual address
  - where are the flags stored/used?
    - the hardware checks the flags at each level
    - typically OSes only do interesting things with the flags at the last
      level (was true in xv6)
- Quiz 8, Q3: accessing VA 0x1234 (on xv6-like 32-bit x86)
  Q: where is the second-level page table entry for this address?
  0x1234: bits 31 to 22 --> 0 <-- index in the first-level table
  so the first-level PTE (page table entry) is at the beginning of that table
  information from the question: "first entry in the first-level page table [...]
  had PPN (physical page number) 0x43"
  bits 21 to 12 --> 1 <-- index in the second-level table
  so the second-level PTE is the second one in the table
  (second b/c index 1 is after index 0, which is at the start)
  page table entries were 4 bytes, so this is 4 bytes after the beginning of the table
  --> 4 bytes after the beginning of PPN 0x43
      (the beginning of PPN 0x43 is byte address 0x43000, b/c pages are 0x1000 bytes)
  --> PA 0x43004
  (page offset ---> 0x234)
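  a minimal sketch (my code, not xv6's actual walkpgdir()) of this two-level
  lookup; phys_to_virt() is an assumed helper that lets us read memory at a
  physical address, and PTE_P/PTE_ADDR follow the x86 PTE layout:

    #include <stdint.h>

    #define PTE_P          0x001              /* "present" flag */
    #define PTE_ADDR(pte)  ((pte) & ~0xfffu)  /* upper 20 bits: PPN << 12 */

    uint32_t *phys_to_virt(uint32_t pa);  /* assumed: make PA readable */

    /* translate va using the page table rooted at physical address pt_base;
       returns 0 and fills *pa_out on success, -1 if a flags check fails */
    int translate(uint32_t pt_base, uint32_t va, uint32_t *pa_out) {
        uint32_t *first = phys_to_virt(pt_base);
        uint32_t pde = first[(va >> 22) & 0x3ff];   /* index: bits 31-22 */
        if (!(pde & PTE_P)) return -1;              /* check flags */
        uint32_t *second = phys_to_virt(PTE_ADDR(pde));
        uint32_t pte = second[(va >> 12) & 0x3ff];  /* index: bits 21-12 */
        if (!(pte & PTE_P)) return -1;              /* check flags */
        *pa_out = PTE_ADDR(pte) | (va & 0xfff);     /* PPN + page offset */
        return 0;
    }

  with the quiz's setup, translate(pt_base, 0x1234, &pa) reads first[0], then
  the second-level entry at PA 0x43004, matching the walk above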
- xv6 filesystem / inode-based filesystems
  - in an inode-based filesystem, most file metadata is stored in an inode
    there is an array of inodes in a dedicated part of the disk
    an inode includes pointers to the data for the file
    unlike FAT, we have arrays of pointers (in a tree structure for large files)
    directory entries contain a name and the index of an inode
    index = where in the array the inode is
    unlike FAT, where a lot of other metadata is in directory entries
- relationship between using RPC in twophase and how RPC works in general
  in twophase, we're running an RPC server for the workers (so we can ask them
  what their current value is, and so the coordinator can ask them to do
  things) and for the coordinator (so we [e.g. testing code] can ask the
  coordinator to do something)
  typically in a commercial setting: we'd set up an RPC server for each
  "service" we wanted to access remotely
  note that even though we might not think of the workers as the "main" server,
  b/c we want to ask them what their current value is, in the RPC model we
  needed to have them run a server
- twophase - do worker replies need to be logged by coordinators?
  generally, to decide whether something needs to be logged, ask: what should
  we do if we reboot after this happens?
  if we need to know that the replies happened, then yes, better log them
  if we *don't log*, we might have to ask for the replies again
    this might be wasteful, but it's safe --> produces consistent answers,
    because workers never change their minds
  if we *do* log, we might be able to avoid asking for the replies again
    this might be more efficient, but it's just an optimization over the alternative
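  a minimal sketch (hypothetical names, not the actual twophase assignment API)
  of that choice: on reboot, the coordinator uses a logged reply if it has one,
  and otherwise safely asks again, since workers never change their votes:

    enum vote { VOTE_UNKNOWN, VOTE_COMMIT, VOTE_ABORT };

    /* assumed RPC stub: re-sending this request is safe (the worker answers
       the same way every time); it just costs an extra round trip */
    enum vote rpc_ask_vote(int worker);

    /* assumed log lookup: returns VOTE_UNKNOWN if the reply was never logged */
    enum vote log_lookup_vote(int worker);

    /* after a coordinator reboot, recover this worker's vote */
    enum vote recover_vote(int worker) {
        enum vote v = log_lookup_vote(worker);
        if (v != VOTE_UNKNOWN)
            return v;                 /* logged: skip the RPC (optimization) */
        return rpc_ask_vote(worker);  /* not logged: wasteful but safe */
    }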