- format of exam
  - generally similar to quizzes, but there will be a few more open-ended questions
    ~ longer questions probably similar in idea to prior final exams/midterms
      ("study materials" tab on course website), but only Spring 2020 had an
      open-book, open-notes format
  - aiming for it being doable in 3 hours
    ~ assuming about 5 minutes per quiz-like multiple-choice or short-answer question
  - 24-hour period to complete the exam
- topics that might be included on the exam
  - please refer to the schedule
- exam grading
  - because there are some open-ended questions, it will take some time
- what to memorize
  - open-book, open-notes
  - but if you did something in an assignment, I think that's reasonable to ask about
- system calls and context switches -- what triggers them
  - system call: a process wants the OS to do something for it, and (typically)
    the process can't do it itself b/c it requires kernel mode for some reason
    > the process deliberately runs an instruction that causes the OS's trap handler to run
    > often this is done as part of a library function
  - context switch: the OS changes which thread is running on a core
    ~ this can be triggered because the current thread doesn't have more work to do right now
      > e.g. it made a system call to read data that isn't available right now
        (and the system is supposed to wait until there is data)
      > e.g. the thread asked to exit
    ~ or can be triggered because the scheduling policy says that some other thread
      is better to run; the OS needs to arrange for its own code to run to detect this
      > e.g. exceeded time quantum -- would probably be detected from a timer interrupt handler
      > e.g. a higher-priority thread is runnable now because its input came in
        -- would probably be detected from the interrupt handler for the device
           where that input came from
- xv6: each process had exactly one thread, so all context switches were both
  thread and process context switches
  if we have more than one thread per process, then we can context switch from
  thread A in process 1 to thread B in process 1,
  or we can context switch from thread A in process 1 to thread C in process 2
  > both are context switches
  > the latter is sometimes called a "process context switch" since we need to
    change some more information
- Quiz 2, question 3 (swtch)

    struct context *first;
    struct context *second;
    int counter = 0;
    int foo(int a) {
        ...
        swtch(&first, ...); /* <---- */
        counter += 2;
        cprintf("foo 1 (%d) ", counter);
        swtch(&first, second);
        counter += 2;
        cprintf("foo 2 (%d) ", counter);
        return a;
    }
    void bar() {
        swtch(&second, first);
        counter += 1;
        cprintf("bar 1 (%d)", counter); /* ******** */
    }

  When "bar 1" is being printed out (assuming foo() has not yet returned),
  where might the value of foo's argument a be located?
  * observation 1: if "bar 1" is being printed out, bar() is currently using
    the active stack frame on the current stack
  * observation 2: foo()'s argument a is kept in either:
    * some registers when foo() is running
      > when foo() stops running, either it's still in those registers or it's
        saved on the stack
    * or on the stack of foo()
  Question: how did bar() start running?
  - looks like foo() can't have called bar()
  - so, bar() must have been run while foo() was still active, via a call to swtch()
  Assuming it's a call to swtch() in foo():
  What did that call to swtch() do?
  - saved the registers of foo() on its stack
  - switched to a different stack
  Therefore: the argument of foo() must be on the stack that was active when
  swtch() was called, and *not* on the stack that is active now, when bar() is running.
  If instead it's a call to swtch() in a trap handler:
  Why would a trap handler call swtch()?
  Example: if there's a timer interrupt and the process/thread running foo()
  has exceeded its time quantum.
  What happened to foo()'s stack+registers when that trap handler was started?
  - the trap handler used the same stack, because foo() executed in the kernel
    [would not be true if foo() was in user mode]
  - running the trap handler saved the registers that were previously active
    in a trapframe
  - and the trapframe was stored on the same stack
  --> either way of running swtch(): the argument values are going to be on foo()'s stack
  three possible ways they got there:
  - saved on the stack when foo() starts
  - was in a register, but saved on the stack when foo() called swtch()
  - was in a register, but saved on the stack when the trap handler started
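  to see swtch()-style "save registers, switch stacks" behavior outside xv6,
  here is a minimal runnable sketch (my example, not xv6 or course code) using
  POSIX ucontext; the names foo_ctx/bar_ctx and the stack size are made up:

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t foo_ctx, bar_ctx;

    static void bar(void) {
        /* now on bar's own stack: foo's argument a is NOT anywhere on this
           stack; it lives on foo's stack (or in the registers saved into
           foo_ctx by swapcontext) */
        printf("bar 1\n");
        swapcontext(&bar_ctx, &foo_ctx);   /* like swtch(&second, first) */
    }

    static void foo(int a) {
        printf("foo starts (a=%d)\n", a);
        swapcontext(&foo_ctx, &bar_ctx);   /* like swtch(&first, second):
                                              saves foo's registers, switches
                                              to bar's stack */
        printf("foo resumes (a=%d)\n", a); /* a restored from foo's stack /
                                              saved registers */
    }

    int main(void) {
        static char bar_stack[64 * 1024];  /* bar gets its own stack */
        getcontext(&bar_ctx);
        bar_ctx.uc_stack.ss_sp = bar_stack;
        bar_ctx.uc_stack.ss_size = sizeof bar_stack;
        bar_ctx.uc_link = NULL;            /* bar never runs to completion here */
        makecontext(&bar_ctx, bar, 0);
        foo(42);
        return 0;
    }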
- barriers
  abstraction that provides a "have everyone sync up with each other" operation

    thread 1          thread 2          thread 3
    do step 1,        do step 1,        do step 1,      } for this to work, parts A-C must
    part A            part B            part C          } be safe to do at the same time
    barrier_wait()    barrier_wait()    barrier_wait()  } threads 1-3 won't finish this
                                                        } "wait" until every thread gets here
    do step 2,        do step 2,        do step 2,      } commonly, step 2 part A requires
    part A            part B            part C          } step 1 part A AND step 1 part B to
                                                        } be done (and similar for other parts)

  example in Life: need cell values from parts A and B of the grid to compute
  the next values for part A
  the barrier prevents running into trouble if one thread is a bit slow:

    thread 1          thread 2
    do step 1,        do step 1, ...
    part A            part B (slow)
    do step 2, ...                } unsafe/unwanted overlap: step 2 part A runs
    part A            part B     } while step 1 part B is still unfinished

- locks
  lock: an abstraction for ensuring that threads take turns doing something
  operations:
    lock()/acquire() -- wait until no one else is taking a turn
    unlock()/release() -- let someone else take a turn
  call the idea of taking turns accessing something "mutual exclusion"
  call code that runs under the taking-turns restriction a "critical section"
  typical pattern: we have a lock for each "resource"
    if resources A and B have separate locks, one thread can access resource A
    while another accesses resource B,
    but no two threads can access resource A at the same time
    or access resource B at the same time
- race conditions
  when we can have inconsistent/wrong behavior b/c of exactly when threads
  access a shared resource
  classic example:
    balance = balance + amount;
  if we took turns, we'd consistently get amount added to balance,
  but when multiple threads do this without taking turns, we can end up with
  inconsistent answers depending on exactly how they get scheduled (when the
  instructions that make up the operation take effect relative to each other)
  another example:
    if (!queue.empty()) {
        item = queue.dequeue()
        ...
    }
  if we don't take turns accessing the queue, then we could call dequeue() when
  the queue is empty, b/c someone else dequeued in between
  if we don't take turns accessing the queue, then if the queue is implemented
  with an array and a counter and doesn't take precautions, two threads could
  dequeue the same item
  "race" --> if one thread is fast enough, something different happens
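  a minimal runnable sketch (my example, not course code) of the classic
  balance race and the lock that fixes it, using POSIX threads; deposit_many
  and the iteration count are arbitrary names/values:

    #include <pthread.h>
    #include <stdio.h>

    static long balance = 0;
    static pthread_mutex_t balance_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *deposit_many(void *arg) {
        for (int i = 0; i < 1000000; i++) {
            pthread_mutex_lock(&balance_lock);   /* wait for our turn */
            balance = balance + 1;               /* the critical section */
            pthread_mutex_unlock(&balance_lock); /* let someone else take a turn */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, deposit_many, NULL);
        pthread_create(&t2, NULL, deposit_many, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* with the lock: always 2000000; without it, the two threads'
           load/add/store instructions can interleave and deposits get lost */
        printf("balance = %ld\n", balance);
        return 0;
    }

  (compile with -pthread; removing the lock/unlock calls usually makes the
  final balance come out short, demonstrating the race)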
- deadlock prevention by avoiding waiting
  simple example of deadlock:

    thread 1:            thread 2:
    Lock(a)              Lock(b)
    Lock(b) <-- wait     Lock(a) <-- wait

  if instead of plain locking we did:

    thread 1:                                   thread 2:
    Lock(a)                                     Lock(b)
    Lock(b), or throw exception and unlock a    Lock(a), or throw exception and unlock b
      if we can't get the lock right away         if we can't get the lock right away

  this avoids deadlock: we don't have infinite waiting
  this doesn't mean that it makes progress
    > would need to do some extra work to make sure that this throwing-exception
      case doesn't happen indefinitely (a problem called "livelock")
    > could hope that eventually the timing will work out, and could do some
      extra work to make it more likely to work out
  this "lock or give up" pattern is the most straightforward way to avoid waiting
  other ways (not always applicable):
  * not having thread 1 and thread 2 and ... share any resources
  * can also avoid "hold and wait": if we only lock 1 thing, we aren't holding
    another resource while we wait (even if we wait for that one thing)
    >> sufficient to prevent deadlock
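  a minimal sketch (my example) of the lock-or-give-up idea using
  pthread_mutex_trylock, the standard non-waiting acquire; lock_both and
  transfer are hypothetical names:

    #include <pthread.h>
    #include <sched.h>
    #include <stdbool.h>

    /* try to take both locks; on failure, release everything and report it
       (the caller can retry, which risks livelock but never deadlock) */
    bool lock_both(pthread_mutex_t *first, pthread_mutex_t *second) {
        pthread_mutex_lock(first);
        if (pthread_mutex_trylock(second) != 0) { /* can't get it right away */
            pthread_mutex_unlock(first);          /* ...so stop holding first */
            return false;                         /* the "throw exception" step */
        }
        return true;   /* holding both locks */
    }

    void transfer(pthread_mutex_t *a, pthread_mutex_t *b) {
        while (!lock_both(a, b))
            sched_yield();  /* back off so success becomes more likely */
        /* ... critical section using both resources ... */
        pthread_mutex_unlock(b);
        pthread_mutex_unlock(a);
    }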
- devices and memory-mapped I/O and interrupts
  - devices have a device controller, typically attached to the memory bus
  - when the processor tries to access memory, it sends a message on the memory
    bus to do the read/write
    - typically, the actual memory responds
    - but device controllers can respond instead, for memory locations chosen for them
  - the device controller hardware can be designed to do whatever it wants to
    respond to these memory reads/writes
  - typically: the device controller exposes "control registers",
    memory locations that
    * give information about the device's state
    * allow the OS to ask the device to do something (e.g. transfer data)
    device controllers can also expose buffers they control,
    e.g. a memory location that contains the last keypress
  - interrupts: the memory bus gives device controllers a way to tell the
    processor "I have something the OS might care about"
    the processor turns this into an interrupt --> runs the OS's trap handler
    typically device controllers use this:
    * to notify the OS of input it needs to handle
    * to notify the OS that some output it started earlier is done
    * ... anything else the OS needs to respond to
    "the OS" --> typically the device driver component of the OS
- device drivers
  - "bottom half": runs in response to interrupts
    > deals with cases where the device controller says "hey, OS, I have
      something for you to deal with"
      example: data for you to send to applications somehow
  - "top half": runs in response to system calls (or similar)
    > deals with application requests to access a device
      example: write something to the screen
    > might communicate with the device
    > might store information in / read information from data structures
      set up/used by the bottom half
      example: kernel buffer of keyboard input, filled in by the "bottom half"
      and copied to the application by the "top half"
- multi-level page tables
  - hardware lookup:
    - a special register (the page table base register) contains the physical
      address of the first-level page table
    - given an N-bit virtual address X, the hardware will:
      - look up index X[bits N-1 to N-K] in the first-level page table to get a
        page table entry
        > K is log_2(number of entries per page table)
      - check the flags of that page table entry
      - extract the physical page number Y from that page table entry
      - look up index X[bits N-K-1 to N-2K] in the second-level page table
        located in physical page Y
      - check the flags of that page table entry
      - extract the physical page number Z from that page table entry
      ...
      - look up index X[bits (number of page-offset bits + K - 1) to (number of
        page-offset bits)] in the last-level page table, located in the
        physical page extracted at the previous level
      - check the flags
      - extract the physical page number A from that page table entry
      - access memory at offset X[bits (number of page-offset bits - 1) to 0]
        of physical page A and give that to the program
  - 32-bit x86 as used by xv6:
    - first lookup uses bits 31 to 22 inclusive (bit 31 is the most significant bit)
    - second lookup uses bits 21 to 12 inclusive and gets the final physical page number
    - page offset is bits 11 to 0 inclusive
  - which addresses are physical and which are virtual?
    - all of them are physical, except for the input virtual address
  - where are the flags stored/used?
    - the hardware checks the flags at each level
    - typically OSes only do interesting things with the flags at the last
      level (was true in xv6)
- Quiz 8, Q3: accessing VA 0x1234 (on xv6-like 32-bit x86)
  Q: where is the second-level page table entry for this address?
  0x1234: bits 31 to 22 --> 0 <-- index in the first-level table
  so the first-level PTE (page table entry) is at the beginning of that table
  information from the question: "first entry in the first-level page table [...]
  had PPN (physical page number) 0x43"
  bits 21 to 12 --> 1 <-- index in the second-level table
  so the second-level PTE is the second one in the table
  (second b/c index 1 is after index 0, which is at the start)
  page table entries were 4 bytes, so this is 4 bytes after the beginning of the table
  --> 4 bytes after the beginning of PPN 0x43
      (the beginning of PPN 0x43 is byte address 0x43000, b/c pages are 0x1000 bytes)
  --> PA 0x43004
  (page offset ---> 0x234)
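  a minimal sketch (my code, not xv6's actual walkpgdir()) of this two-level
  lookup; phys_to_virt() is an assumed helper that lets us read memory at a
  physical address, and PTE_P/PTE_ADDR follow the x86 PTE layout:

    #include <stdint.h>

    #define PTE_P          0x001              /* "present" flag */
    #define PTE_ADDR(pte)  ((pte) & ~0xfffu)  /* upper 20 bits: PPN << 12 */

    uint32_t *phys_to_virt(uint32_t pa);  /* assumed: make PA readable */

    /* translate va using the page table rooted at physical address pt_base;
       returns 0 and fills *pa_out on success, -1 if a flags check fails */
    int translate(uint32_t pt_base, uint32_t va, uint32_t *pa_out) {
        uint32_t *first = phys_to_virt(pt_base);
        uint32_t pde = first[(va >> 22) & 0x3ff];   /* index: bits 31-22 */
        if (!(pde & PTE_P)) return -1;              /* check flags */
        uint32_t *second = phys_to_virt(PTE_ADDR(pde));
        uint32_t pte = second[(va >> 12) & 0x3ff];  /* index: bits 21-12 */
        if (!(pte & PTE_P)) return -1;              /* check flags */
        *pa_out = PTE_ADDR(pte) | (va & 0xfff);     /* PPN + page offset */
        return 0;
    }

  with the quiz's setup, translate(pt_base, 0x1234, &pa) reads first[0], then
  the second-level entry at PA 0x43004, matching the walk above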
- xv6 filesystem / inode-based filesystems
  - in an inode-based filesystem, most file metadata is stored in an inode
    there is an array of inodes in a dedicated part of the disk
    an inode includes pointers to the data for the file
    unlike FAT, we have arrays of pointers (in a tree structure for large files)
    directory entries contain a name and the index of an inode
    index = where in the array the inode is
    unlike FAT, where a lot of other metadata is in directory entries
- relationship between using RPC in twophase and how RPC works in general
  in twophase, we're running an RPC server for the workers (so we can ask them
  what their current value is, and so the coordinator can ask them to do
  things) and for the coordinator (so we [e.g. testing code] can ask the
  coordinator to do something)
  typically in a commercial setting: we'd set up an RPC server for each
  "service" we wanted to access remotely
  note that even though we might not think of the workers as the "main" server,
  b/c we want to ask them what their current value is, in the RPC model we
  needed to have them run a server
- twophase - do worker replies need to be logged by coordinators?
  generally, to decide whether something needs to be logged, ask: what should
  we do if we reboot after this happens?
  if we need to know that the replies happened, then yes, better log them
  if we *don't log*, we might have to ask for the replies again
    this might be wasteful, but it's safe --> produces consistent answers,
    because workers never change their minds
  if we *do* log, we might be able to avoid asking for the replies again
    this might be more efficient, but it's just an optimization over the alternative
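  a minimal sketch (hypothetical names, not the actual twophase assignment API)
  of that choice: on reboot, the coordinator uses a logged reply if it has one,
  and otherwise safely asks again, since workers never change their votes:

    enum vote { VOTE_UNKNOWN, VOTE_COMMIT, VOTE_ABORT };

    /* assumed RPC stub: re-sending this request is safe (the worker answers
       the same way every time); it just costs an extra round trip */
    enum vote rpc_ask_vote(int worker);

    /* assumed log lookup: returns VOTE_UNKNOWN if the reply was never logged */
    enum vote log_lookup_vote(int worker);

    /* after a coordinator reboot, recover this worker's vote */
    enum vote recover_vote(int worker) {
        enum vote v = log_lookup_vote(worker);
        if (v != VOTE_UNKNOWN)
            return v;                 /* logged: skip the RPC (optimization) */
        return rpc_ask_vote(worker);  /* not logged: wasteful but safe */
    }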