- exam coverage for things in reading versus lecture (e.g. DHCP)
    - we intend the exam to only cover things that were in lecture or assignments
- things people struggle with on the exam
    - I'm bad at guessing
    - historically: Spectre, VM stuff, synchronization
- system calls and keyboard input and displaying output
    - three entities involved: the hardware versus the OS versus the "user" program running
    - hardware usually runs just the current thing EXCEPT:
      sometimes the hardware needs to help switch to the OS [exception]
        - for keyboard input: the hardware will help switch to the OS when something else is running
          and the keyboard input needs to be handled
        - for screen output: probably don't need the hardware to notify the OS, but maybe to say when
          the screen is ready
        - timers: the OS might want to set itself up to run later
    - typically, we don't allow "user" programs to directly talk to any input/output device
        - any operation that involves talking to an input/output device must be done by the OS
        - the OS won't magically know what the program wants, so the program will make a system call
          to ask for it
    - sometimes, when the program makes a system call, the OS will already have what the program wants
        example: keypresses happened earlier, and now the program wants to read them
        example: output can be put on the screen right now
    - sometimes, when the program makes a system call, the OS won't have what the program wants
        example: waiting for the user to press some keys
        the OS can record that the program is waiting and possibly run other things
- F2025 Q4e [number of bytes transferred by the cache]
    - writing 4 bytes to a cache block not already in the cache, and the cache has a write-allocate,
      write-back policy
    - in this case: nothing to write back (the replaced block is not dirty)
    - and we need the cache to end up storing: the 4 bytes that were written PLUS the
      (block size - 4) bytes that make up the rest of the block
    - optimization: we can read just the part of the block we aren't overwriting
      (worked numbers appear after the OSI list below)
- how different protocols relate to each other
    - layering model:
        application layer stuff implemented using transport layer stuff
        transport layer stuff implemented using network layer stuff
        ...
    - sockets: give you the "stream" interface associated with IP address + port numbers;
      the API to use the transport layer (a short C sketch appears after the OSI list below)
- OSI layers
    - the OSI model has 7 layers
    - we aren't teaching the full OSI model because it's not what the Internet actually uses;
      the layers we do use:
    - application (= OSI layer 7)
        ~ we have some meaning for messages/streams of data
        ~ example: HTTP --- requests for documents and their contents
        ~ example: SSH --- terminal sessions and commands and things to display from them
    - transport (= OSI layer 4)
        ~ we want to direct messages ("segments" or "datagrams") to specific programs
        ~ we want to tie messages together into a stream of bytes [* with TCP, not UDP]
        ~ we want to send those messages reliably [* with TCP, not UDP]
    - network (= OSI layer 3)
        ~ we want to forward messages ("packets") between different local networks
    - link (= OSI layer 2)
        ~ divide the bits into messages ("frames") and figure out who they're for
    - physical (= OSI layer 1)
        ~ put bits on a "medium" (radio signals, voltage on a wire, ...)
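- worked numbers for the F2025 Q4e discussion above (the 64-byte block size is an assumption for
  illustration, not a value from the exam question):
    miss on a 4-byte write, write-allocate + write-back, and the replaced block is clean
    read from memory now:  64 bytes (the whole block), or 64 - 4 = 60 bytes with the optimization
    written to memory now: 0 bytes (write-back, and there's nothing dirty to write back)
    the 4 written bytes only reach memory later, when this (now dirty) block is eventually evicted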
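- a minimal C sketch for the sockets bullet above, using the POSIX sockets API (the address
  192.0.2.1, the port number 7, and the message are made-up values for illustration):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void) {
        /* ask the OS for a TCP ("stream") socket -- the transport-layer interface */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        /* identify the remote program by IP address + port number */
        struct sockaddr_in server = {0};
        server.sin_family = AF_INET;
        server.sin_port = htons(7);                         /* made-up port number */
        inet_pton(AF_INET, "192.0.2.1", &server.sin_addr);  /* made-up IP address */

        if (connect(fd, (struct sockaddr *) &server, sizeof server) < 0) {
            perror("connect"); return 1;
        }

        /* once connected, the socket behaves like a reliable byte stream */
        const char *msg = "hello\n";
        write(fd, msg, strlen(msg));
        char buf[128];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) fwrite(buf, 1, (size_t) n, stdout);
        close(fd);
        return 0;
    }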
- network address translation
    - IPv4 (still the most common) has too few addresses
    - so we want to share addresses between many machines
    - solution: break the layering model a bit:
        - within a "private" network, each machine has its own IPv4 address
        - outside the network, all these machines use the same (or a small set of) IPv4 addresses
        - but -- we still need to tell which machine each thing is for
        - we use port numbers to do this

              /- all the machines on the private network are addressed with private ("local") IP
              /  addresses that will not work on the real Internet
              v
        [ private network ] ----- (router) ----- [ "real" Internet ]
        (example: within                ^
         an apartment)                   \-- outside, all the machines on the private network are
                                             addressed with the public IP address(es) assigned to
                                             the router (of which there are fewer than there are
                                             devices on the private network)

    - table: [private (local) IP address, private port number, public (local) IP address,
              public port number, remote IP address, remote port number]
    - when a packet comes from the private network, we look up the (private IP, private port,
      remote IP, remote port) and edit the packet as it goes out to use the corresponding
      public (local) IP and public port
    - when a packet comes from the remote network, we look up the (public (local) IP,
      public (local) port, remote IP, remote port) and edit the packet as it comes in to use the
      corresponding private (local) IP and private (local) port
    - the table is filled in as new connections are made
- certificates and digital signatures
    - digital signature:
        Sign(private key, message) = signature
        Verify(public key, message, signature) = 1 iff signature is "genuine"
        a valid (verifying) signature proves that someone with the private key produced it for this
        particular message
        typical usage: attest that something genuinely comes from some principal
    - certificate:
        the message is "Jo's public key is X"
            - the public key could be for encryption or digital signatures
        and we have a signature from a "certificate authority"
            proving that the certificate authority attests to this information
        if we want to use a certificate, we have to trust the certificate authority
            (regardless of whether they're actually trustworthy)
- when is encryption/authentication "safe"
    - S2024 final Q2 (secure channels)
        asking what a passive (eavesdropper) and active (machine-in-the-middle) attacker could do
        client -> server sends PE(server's public encryption key, command + passphrase + unique ID)
        server -> client sends unique ID + response, Sign(server's private signing key, unique ID + response)
        questions were what the attacker could do/learn:
        - key thing: the server sends back its information unencrypted,
          so an eavesdropper can learn everything the server sends back
            - and this includes the event names + dates + statuses in the question
        - if we could manipulate encrypted data, then we might imagine changing the client's command
            - we accepted either answer on the exam because that kind of manipulation is often not
              possible with asymmetric encryption
- monitors (signal/broadcast/mutex_lock)
    - toolkit for "general" synchronization:
      we want threads to sometimes wait for arbitrary things
    - general pattern (a concrete pthread producer/consumer sketch appears at the end of these notes):
        - have some shared data that includes everything we need to determine whether to wait
        - we always acquire a lock when accessing that shared data
          (so we don't have to worry about it changing in between us using it for a decision and
           applying that decision)
        - then:
            Lock(common lock)
            while (we need to wait)
                cond_wait(condition variable, common lock)   [this releases the common lock while we wait]
            do the thing that we needed to possibly wait for
            if (we changed whether other threads need to wait)
                cond_broadcast to their condition variables
            Unlock(common lock)
    - a "condition variable" represents a list of threads waiting for something
      (it doesn't actually know what the thing they're waiting for is; we have to be consistent
       about which condition variable we cond_wait/cond_broadcast on)
    - best practice is one condition variable for each thing we might want to wait for
      (but we don't have to do that)
    - need to double-check the condition after waiting because:
        - between when the broadcast happened and when we got the lock back, something may have
          changed whether we need to wait
            - example: the queue had an item on it, but another thread took that item before we got
              the lock back
        - "spurious wakeups" -- in corner cases, the cond_wait implementation might wake up too early
    - we can signal instead of broadcast if we know that there's never more than one thread that
      might need to be woken up
        - example: producer/consumer -- at most one thread needs to go each time we produce an item,
          so always signaling doesn't wake up too few threads
- branch prediction methods
    - what we want you to know:
        - using recent history to predict the next outcome
        - typically identify the relevant recent history based on the addresses of the branch
          instructions involved
- OOO: decode versus in-order pipeline decode
    - decode in the in-order pipeline is two things:
        - actually interpreting the machine code
          [in the original in-order pipelined processors, this was very simple]
        - reading the registers
    - decode in the OOO pipeline:
        - actually interpreting the machine code
- OOO: where forwarding happens

    Fetch / Decode / Rename --> [buffer] --> Issue / Execute... / Writeback --> [buffer] --> Commit
                                               ^                      |
                                               \----------------------/
                                                  typical forwarding

- Spectre
    - the processor will do things we're "not allowed" to do in speculative execution
      (speculative execution = running instructions before we're sure they're actually supposed to
       run, typically because of branch misprediction or not detecting exceptions yet)
    - what's supposed to happen is that the processor completely cleans up everything we weren't
      allowed to do, so it looks like it never happened
    - BUT the problem --- the processor doesn't completely clean up some hardware structures
        - one example: the data cache
            - the processor won't leave the actual values we computed in the data cache
            - but it will have evicted things to make space for values that were accessed in the
              "not allowed" computation
            ^ applies to both Spectre and Meltdown
    - Spectre variant 1:
        - "not allowed" = some code that's protected by an array bounds check
          [typically in the kernel or some other "privileged" software]
          (a C sketch of this pattern follows this list)
            if (INPUT < array1_size)
                access array2[array1[INPUT] * something]
        - branch prediction can assume that INPUT < array1_size is true (especially if it was true recently)
        - the processor will start to access array2[array1[INPUT] * something];
          part of that process is finding space in the data cache to put it --> evicting something else
        - the processor will not undo that eviction when it detects the branch was mispredicted
          (even though it will discard all the values from registers, etc.)
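- a C sketch of the variant 1 pattern above (a sketch only: the array1/array2/array1_size names
  follow these notes, while the extern declarations and the * 512 stride are made-up details for
  illustration):

    #include <stddef.h>
    #include <stdint.h>

    /* victim code (e.g. in the kernel); the attacker controls INPUT */
    extern uint8_t array1[];      /* in-bounds data; memory past its end holds the secret       */
    extern size_t  array1_size;
    extern uint8_t array2[];      /* "probe" array whose cache footprint the attacker can time  */

    void victim(size_t INPUT) {
        if (INPUT < array1_size) {            /* predicted taken after "training" with small INPUTs */
            uint8_t value = array1[INPUT];    /* speculative out-of-bounds read when INPUT is large */
            (void) array2[value * 512];       /* which cache set gets evicted now depends on value  */
        }
    }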
- we can learn about array1[INPUT] based on which part of the cache was evicted
    - because array1[INPUT] is out of bounds and we can choose INPUT, this can be some value we're
      not supposed to have access to that we (the attacker) choose
      (we need to learn what the memory layout is to choose INPUT)
    - we determine that information from cache access times
    - equations for the unknown value:
        let's say array2[0] maps to cache set X, offset 0
        then array2[0 + K * BLOCK_SIZE / sizeof(array2 element)] maps to cache set (X + K) mod #SETS
        let's say we learned that the access to array2[array1[INPUT] * Q] evicted something from set Y
        --> X + K = Y (mod #SETS)
            K * BLOCK_SIZE / sizeof(array2 element) = array1[INPUT] * Q
        solve for K from the first equation, then from the second
            array1[INPUT] = K * BLOCK_SIZE / (sizeof(array2 element) * Q)
            ~ learning about the index bits of the chosen value
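- a made-up numeric instance of the equations above (64 sets, 64-byte blocks, 1-byte array2
  elements, and Q = 64 are illustrative choices, not values from an exam question):
    suppose array2[0] maps to set X = 10 and we observe an eviction in set Y = 37
    K = Y - X (mod 64) = 27
    array1[INPUT] * Q = K * BLOCK_SIZE / sizeof(array2 element) = 27 * 64 / 1 = 1728
    array1[INPUT] = 1728 / Q = 1728 / 64 = 27
    (since K is only known mod #SETS, we only learn array1[INPUT] mod 64 here -- its index bits)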
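- going back to the monitor pattern earlier in these notes: a minimal pthread producer/consumer
  sketch of that pattern (the bounded LIFO buffer, its size, and the function names are made-up
  details; this is one reasonable way to write it, not the only one):

    #include <pthread.h>

    #define CAPACITY 16

    /* shared data: everything we need to decide whether to wait */
    static int buffer[CAPACITY];
    static int count = 0;                   /* number of items currently queued */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;  /* one cond. variable per thing we wait for */
    static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;

    void produce(int item) {
        pthread_mutex_lock(&lock);
        while (count == CAPACITY)                   /* re-check after waking (race / spurious wakeup) */
            pthread_cond_wait(&not_full, &lock);    /* releases the lock while we wait */
        buffer[count++] = item;
        pthread_cond_signal(&not_empty);   /* at most one waiting consumer can use this item -> signal */
        pthread_mutex_unlock(&lock);
    }

    int consume(void) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&not_empty, &lock);
        int item = buffer[--count];
        pthread_cond_signal(&not_full);    /* we freed exactly one slot -> at most one producer can go */
        pthread_mutex_unlock(&lock);
        return item;
    }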