- flow of a system call
    [user library code] (xv6: "usys.S")
        "system call wrapper" -- provides a C function interface to syscalls
        contains the special instruction to cause an exception
            recall: exception = enter kernel mode
        does any special argument setup --- OS's choice of calling convention for syscalls
            xv6: normal arguments (the ones from C) passed the same way C passes them normally
                --> on the stack (x86-32)
            system call number passed through %eax
                have to pass this somehow --- there are no special parameters to the
                "invoke syscall" instruction (unlike a normal function call, where
                the call instruction includes what to call)
    [generic trap handling code] (xv6: trapasm.S and traps.c)
        handles saving user registers so kernel code can do what it likes
            xv6: saves to the kernel stack
        stores what type of exception happened ("system call")
            [because this code is used for more than syscalls]
            and later looks that up to invoke the correct handler
    [system call dispatch code] (xv6: syscall.c)
        uses the system call number to invoke the correct handler
    [particular system call implementation] (xv6: sysproc.c/sysfile.c)
        retrieves arguments from the user stack
        actually does the requested operation
    returning from a syscall goes through these steps in reverse order
- arguments (F2019 Q1)
- counting context switches
    a context switch happens when we need to run an application
    we don't need to context switch to:
        just make it so we can run an application later [***]
        clear out a process control block
        handle I/O --- the exception handler runs in whatever program is
            currently running [***]
        etc.
    context switch != mode switch (mode switch: kernel versus user)
- context switching and stacks
    in xv6 and many other OSes:
        saving user registers is handled when exceptions happen
            works the same way whether or not there's a context switch
            same idea as why we save registers for a function call
        when we decide to do a context switch, we've already had an exception happen
            so we don't need to do anything extra to save user registers
            we do need to save kernel information for the actual switch operation
                (and restore it from the destination process)
        restoring the user information is done the same way we'd return from an
            exception if there were no context switch
            same idea as returning from a function call that saved registers
    in xv6:
        context switches push kernel registers onto the kernel stack
            (and that's where kernel registers are saved when a thread isn't running)
        entering the kernel for an exception pushes user registers onto the kernel stack
            (and that's where user registers are saved when a thread isn't running)
- fork: what code runs, what values are copied
    EVERYTHING: after fork, there are two processes and nothing is shared between them
        int global = 200;
        int function() {
            int local = 100;
            int *local_pointer = malloc(1000);
            pid_t pid = fork();
            *** // use local, global, etc.
        }
    the code labelled *** will run in two processes (if fork() works --- enough
    memory, etc.): in the parent and in the child
    each will have separate copies of global and local and local_pointer and
    *local_pointer
        so there is no way that what the parent does with global, local,
        local_pointer, or *local_pointer after forking will affect the child,
        and vice-versa
    because we have address translation, the parent and child will have the same
    virtual addresses allocated, and the same virtual addresses for everything,
    immediately after fork()
    -----
    the same is true of the set of file descriptors
        (though they'll still point to the same files/pipes/etc.)
- MLFQ: estimating how long things will run
- quiz 4 Q2: estimating how long until things start running
    * the idea of MLFQ is that things with short CPU bursts settle at a higher
      priority than things with long CPU bursts
    --> the back-of-the-envelope model is:
        * pretend everything has static priorities
        * look at how much time is left for program A after everything with
          higher priority than A runs as much as it wants to
        * don't need to consider things at lower priority than A in figuring
          out how much A runs
    need to do extra work if:
        * there might be multiple things at the same priority
        * priorities are fluctuating
        * it matters exactly when programs become runnable relative to each other
    --> non-back-of-the-envelope: simulate what happens. Track the priorities
        of everything and say "this runs for X ms", then ...
    * for things with fluctuating priorities:
        * the time quantum limits how long they can stay at a higher priority
        * e.g. if program B can sometimes be at a higher priority than A, it
          can only do so for (time quantum at the higher priorities) amount of
          time each time it gets to run
- monitors/mutexes versus semaphores
    * monitors = mutex + CVs + data
        can accomplish the same things that semaphores + data can
    * semaphores combine critical-section functionality (= mutual exclusion)
      and wait-for-event functionality
        versus monitors: locks for crit. sect.
        and CVs for wait-for-event
    * semaphores have memory: they remember a count of the number of things
      that are allowed to wake up
        CVs: just deal with things that are already waiting
            waking something up on a CV now won't affect things that start
            waiting later
    * common semaphore patterns:
        lock: semaphore with initial value 1
            lock = wait/down
            unlock = post/up
        "gate": --- kinda like CVs
            semaphore with initial value 0
            let a waiting thread go = post/up
            wait for someone to let me go = wait/down
        resource tracking:
            reserve a resource (or wait for it to be available) = wait/down
            free up a resource = post/up
    * common monitor patterns
        * lock when manipulating shared data
        * while (we can't do what we want to do)
              cond_wait(some_CV, some_lock)
          NEED a while loop because:
              spurious wakeups (cond_wait can return for "no" reason)
              another thread could run in between when we stop waiting and
              when we get the lock back
                  example: if we're trying to dequeue, another thread could
                  dequeue first even though we were woken up to dequeue
                  consequence of not having Hoare scheduling: signal does not
                  "hand off" the lock
        * if (a thread might be able to do what it wants to do)
              cond_signal/broadcast(some_CV)
- reader/writer locks
    - commonly it is okay to have multiple threads read something at the same
      time, but only if no thread is writing/modifying it
    - reader/writer locks are an extension of locks that support this well
    - have two types of lock operations
        ReadLock: wait until there are no writers
        WriteLock: wait until there are no readers or writers
    - a thread is considered a reader/writer when it has done a Read/WriteLock
      and hasn't unlocked
    - Q12 S2019: fill-in-the-blank with monitors
        reader/writer lock where we have a maximum # of allowed readers
        AND with writer priority
        new condition for ReadLock: wait until there are no writers AND there
        are fewer than the max # of allowed readers AND no writers are waiting
        for the lock
        with monitors:
            ReadLock() {
                ....
                while (writers OR max # of readers OR writer waiting for the lock) {
                    cond_wait(reader_cv);
                }
            }
        ALSO need to make sure the CV is signalled when this new condition
        changes, for example:
            ReadUnlock() {
                ...
                if (no writer waiting for the lock) {
                    cond_signal(reader_cv)  // because we decreased the # of readers
                } else if (writers might be able to go) {
                    cond_signal(writer_cv)
                }
            }
        Why is it only checking that no writer is waiting?
            (1) if no thread is waiting to read, then cond_signal will notice
                that there's no thread waiting and be very fast
            (2) it's possible for two threads to ReadUnlock() before any
                waiting thread finishes ReadLock():
                C: ReadLock()
                D: ReadLock()
                C: wait  # to read
                D: wait  # to read
                A: ReadUnlock() signals C
                B: ReadUnlock() --> should signal here, or else D won't wake up
        rule of thumb: make sure you think about "what if the thing I signal
        doesn't run immediately"
- data races
    - writing a value while another thread might be reading/writing it
    - the result of the read or write might vary depending on the exact timing
      (thread scheduling, how the compiler reorders things, how the processor
      reorders things)
        --> often (but not always) leads to a bug
    example: writing a map (typically a red/black tree) while another thread
    is reading it
        the write might update a pointer and delete the original
        the read might use the deleted pointer
- deadlocks
    - circular waiting --- "I'm waiting for someone who's waiting for me"
    - most common: I've locked lock A and am trying to get lock B, while
      someone else has locked lock B and is trying to get lock A
        ... but there can be less direct circular dependencies
        (e.g. more than two threads, not just locks)
    - not the only cause of threads hanging
        - not checking a condition properly before pthread_cond_wait'ing can
          cause hanging, but it's not a deadlock (not supposed to be waiting)
        - starvation is not actually a circular dependency
            e.g.
            if you had a thread busy-waiting that always got to run instead
            of the threads actually doing computation
        - running really slowly --- not a deadlock, since a deadlock requires
          intervention to stop
- virtual memory: VA -> PA
    step one: split up the virtual address into pieces:
        [           virtual address          ]
        [ virtual page number ][ page offset ]
                               ^^^^^^^^^^^^^^-- big enough to have a unique #
                                                for each byte in a page
                                                e.g. 2^12-byte pages --> 12 bits
        [ VPN pt 1 ][ VPN pt 2 ]
        ^^^^^^^^^^^------------- index into the first-level page table
                    ^^^^^^^^^^^- index into the second-level page table
        size of VPN pt 1/2: big enough to have one value for each possible index
            e.g. 2^10 entries in the page table --> 10 bits
    step two: look up in the first-level page table
        page table base pointer --- beginning of the array of 1st-level PTEs
        access element that_array[VPN pt 1]
            at address: address of that_array + VPN pt 1 * sizeof(PTE)
        the PTE will contain:
            * valid bit --- is there a second-level page table? (if no: stop!)
            * other metadata
            * physical page number
        convert the physical page number to a physical address by concatenating
        an all-zeroes page offset (or by multiplying by the page size)
            [converts from pages to bytes]
    step three: look up in the second-level page table
        same process as step two, but:
            use the physical address we got from the PTE in step two
            use VPN pt 2 instead of pt 1
    step four: take the physical address we got from step three and add the
    page offset from the original VA
        result: the PA that will be accessed
- page faults: when do they happen
    - if we get to the "stop!" above
    - if other metadata in the PTE says "not writable/readable/etc." and we're
      trying to do a write/read/etc.
        sometimes called a "protection fault" instead of a "page fault"
- copy-on-write
    - want to: copy data, but not right now
    - mark the data read-only
        - we will get a page fault when it's written
        - copy when that page fault happens
            guarantee: the copy happens before the data is modified [and we
            lose our chance to preserve the original value]
    - relies on page faults letting us change the page table and retry the
      operation
- CFS
    key idea: virtual time ~ scaled actual time a program has been run
    main goal: fairness in virtual time
        --> highest priority to run = lowest virtual time
    secondary goal (1): don't let things that have NOT run for a very long
    time get a huge advantage
        example: if I walk away from my email client for 1hr, it shouldn't get
        ALL the CPU time when I come back
        solution: adjust virtual time within a limit when things come back
        after waiting a while
    secondary goal (2): manage context switch overhead
        solution: don't switch to the lowest virtual time so often that we're
        always context switching
            set a minimum time quantum based on the # of things
            running/parameters