- exam coverage for things in reading versus lecture (e.g. DHCP)
    - we intend the exam to only cover things that were in lecture or assignments
- things people struggle with on the exam
    - I'm bad at guessing
    - historically: Spectre, VM stuff, synchronization
- system calls and keyboard input and displaying output
    - three entities involved: the hardware versus the OS versus the "user" program running
    - hardware usually runs just the current thing EXCEPT:
      sometimes the hardware needs to help switch to the OS [exception]
        - for keyboard input: the hardware will help switch to the OS when something else is running
          and the keyboard input needs to be handled
        - for screen output: probably don't need the hardware to notify the OS, but maybe to say when
          the screen is ready
        - timers: the OS might want to set itself up to run later
    - typically, we don't allow "user" programs to directly talk to any input/output device
        - any operation that involves talking to an input/output device must be done by the OS
        - the OS won't magically know what the program wants, so the program will make a system call
          to ask for it
    - sometimes, when the program makes a system call, the OS will already have what the program wants
        example: keypresses happened earlier, and now the program wants to read them
        example: output can be put on the screen right now
    - sometimes, when the program makes a system call, the OS won't have what the program wants
        example: waiting for the user to press some keys
        the OS can record that the program is waiting and possibly run other things
- F2025 Q4e [number of bytes transferred by the cache]
    - writing 4 bytes to a cache block not already in the cache, and the cache has a write-allocate,
      write-back policy
    - in this case: nothing to write back (the replaced block is not dirty)
    - and we need the cache to end up storing: the 4 bytes that were written PLUS the
      (block size - 4) bytes that make up the rest of the block
    - optimization: we can read just the part of the block we aren't overwriting
      (worked numbers appear after the OSI list below)
- how different protocols relate to each other
    - layering model:
        application layer stuff implemented using transport layer stuff
        transport layer stuff implemented using network layer stuff
        ...
    - sockets: give you the "stream" interface associated with IP address + port numbers;
      the API to use the transport layer (a short C sketch appears after the OSI list below)
- OSI layers
    - the OSI model has 7 layers
    - we aren't teaching the full OSI model because it's not what the Internet actually uses;
      the layers we do use:
    - application (= OSI layer 7)
        ~ we have some meaning for messages/streams of data
        ~ example: HTTP --- requests for documents and their contents
        ~ example: SSH --- terminal sessions and commands and things to display from them
    - transport (= OSI layer 4)
        ~ we want to direct messages ("segments" or "datagrams") to specific programs
        ~ we want to tie messages together into a stream of bytes [* with TCP, not UDP]
        ~ we want to send those messages reliably [* with TCP, not UDP]
    - network (= OSI layer 3)
        ~ we want to forward messages ("packets") between different local networks
    - link (= OSI layer 2)
        ~ divide the bits into messages ("frames") and figure out who they're for
    - physical (= OSI layer 1)
        ~ put bits on a "medium" (radio signals, voltage on a wire, ...)
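- worked numbers for the F2025 Q4e discussion above (the 64-byte block size is an assumption for
  illustration, not a value from the exam question):
    miss on a 4-byte write, write-allocate + write-back, and the replaced block is clean
    read from memory now:  64 bytes (the whole block), or 64 - 4 = 60 bytes with the optimization
    written to memory now: 0 bytes (write-back, and there's nothing dirty to write back)
    the 4 written bytes only reach memory later, when this (now dirty) block is eventually evicted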
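- a minimal C sketch for the sockets bullet above, using the POSIX sockets API (the address
  192.0.2.1, the port number 7, and the message are made-up values for illustration):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void) {
        /* ask the OS for a TCP ("stream") socket -- the transport-layer interface */
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        /* identify the remote program by IP address + port number */
        struct sockaddr_in server = {0};
        server.sin_family = AF_INET;
        server.sin_port = htons(7);                         /* made-up port number */
        inet_pton(AF_INET, "192.0.2.1", &server.sin_addr);  /* made-up IP address */

        if (connect(fd, (struct sockaddr *) &server, sizeof server) < 0) {
            perror("connect"); return 1;
        }

        /* once connected, the socket behaves like a reliable byte stream */
        const char *msg = "hello\n";
        write(fd, msg, strlen(msg));
        char buf[128];
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) fwrite(buf, 1, (size_t) n, stdout);
        close(fd);
        return 0;
    }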
- network address translation
    - IPv4 (still the most common) has too few addresses
    - so we want to share addresses between many machines
    - solution: break the layering model a bit:
        - within a "private" network, each machine has its own IPv4 address
        - outside the network, all these machines use the same (or a small set of) IPv4 addresses
        - but -- we still need to tell which machine each thing is for
        - we use port numbers to do this

              /- all the machines on the private network are addressed with private ("local") IP
              /  addresses that will not work on the real Internet
              v
        [ private network ] ----- (router) ----- [ "real" Internet ]
        (example: within                ^
         an apartment)                   \-- outside, all the machines on the private network are
                                             addressed with the public IP address(es) assigned to
                                             the router (of which there are fewer than there are
                                             devices on the private network)

    - table: [private (local) IP address, private port number, public (local) IP address,
              public port number, remote IP address, remote port number]
    - when a packet comes from the private network, we look up the (private IP, private port,
      remote IP, remote port) and edit the packet as it goes out to use the corresponding
      public (local) IP and public port
    - when a packet comes from the remote network, we look up the (public (local) IP,
      public (local) port, remote IP, remote port) and edit the packet as it comes in to use the
      corresponding private (local) IP and private (local) port
    - the table is filled in as new connections are made
- certificates and digital signatures
    - digital signature:
        Sign(private key, message) = signature
        Verify(public key, message, signature) = 1 iff signature is "genuine"
        a valid (verifying) signature proves that someone with the private key produced it for this
        particular message
        typical usage: attest that something genuinely comes from some principal
    - certificate:
        the message is "Jo's public key is X"
            - the public key could be for encryption or digital signatures
        and we have a signature from a "certificate authority"
            proving that the certificate authority attests to this information
        if we want to use a certificate, we have to trust the certificate authority
            (regardless of whether they're actually trustworthy)
- when is encryption/authentication "safe"
    - S2024 final Q2 (secure channels)
        asking what a passive (eavesdropper) and active (machine-in-the-middle) attacker could do
        client -> server sends PE(server's public encryption key, command + passphrase + unique ID)
        server -> client sends unique ID + response, Sign(server's private signing key, unique ID + response)
        questions were what the attacker could do/learn:
        - key thing: the server sends back its information unencrypted,
          so an eavesdropper can learn everything the server sends back
            - and this includes the event names + dates + statuses in the question
        - if we could manipulate encrypted data, then we might imagine changing the client's command
            - we accepted either answer on the exam because that kind of manipulation is often not
              possible with asymmetric encryption
- monitors (signal/broadcast/mutex_lock)
    - toolkit for "general" synchronization:
      we want threads to sometimes wait for arbitrary things
    - general pattern (a concrete pthread producer/consumer sketch appears at the end of these notes):
        - have some shared data that includes everything we need to determine whether to wait
        - we always acquire a lock when accessing that shared data
          (so we don't have to worry about it changing in between us using it for a decision and
           applying that decision)
        - then:
            Lock(common lock)
            while (we need to wait)
                cond_wait(condition variable, common lock)   [this releases the common lock while we wait]
            do the thing that we needed to possibly wait for
            if (we changed whether other threads need to wait)
                cond_broadcast to their condition variables
            Unlock(common lock)
    - a "condition variable" represents a list of threads waiting for something
      (it doesn't actually know what the thing they're waiting for is; we have to be consistent
       about which condition variable we cond_wait/cond_broadcast on)
    - best practice is one condition variable for each thing we might want to wait for
      (but we don't have to do that)
    - need to double-check the condition after waiting because:
        - between when the broadcast happened and when we got the lock back, something may have
          changed whether we need to wait
            - example: the queue had an item on it, but another thread took that item before we got
              the lock back
        - "spurious wakeups" -- in corner cases, the cond_wait implementation might wake up too early
    - we can signal instead of broadcast if we know that there's never more than one thread that
      might need to be woken up
        - example: producer/consumer -- at most one thread needs to go each time we produce an item,
          so always signaling doesn't wake up too few threads
- branch prediction methods
    - what we want you to know:
        - using recent history to predict the next outcome
        - typically identify the relevant recent history based on the addresses of the branch
          instructions involved
- OOO: decode versus in-order pipeline decode
    - decode in the in-order pipeline is two things:
        - actually interpreting the machine code
          [in the original in-order pipelined processors, this was very simple]
        - reading the registers
    - decode in the OOO pipeline:
        - actually interpreting the machine code
- OOO: where forwarding happens

    Fetch / Decode / Rename --> [buffer] --> Issue / Execute... / Writeback --> [buffer] --> Commit
                                               ^                      |
                                               \----------------------/
                                                  typical forwarding

- Spectre
    - the processor will do things we're "not allowed" to do in speculative execution
      (speculative execution = running instructions before we're sure they're actually supposed to
       run, typically because of branch misprediction or not detecting exceptions yet)
    - what's supposed to happen is that the processor completely cleans up everything we weren't
      allowed to do, so it looks like it never happened
    - BUT the problem --- the processor doesn't completely clean up some hardware structures
        - one example: the data cache
            - the processor won't leave the actual values we computed in the data cache
            - but it will have evicted things to make space for values that were accessed in the
              "not allowed" computation
            ^ applies to both Spectre and Meltdown
    - Spectre variant 1:
        - "not allowed" = some code that's protected by an array bounds check
          [typically in the kernel or some other "privileged" software]
          (a C sketch of this pattern follows this list)
            if (INPUT < array1_size)
                access array2[array1[INPUT] * something]
        - branch prediction can assume that INPUT < array1_size is true (especially if it was true recently)
        - the processor will start to access array2[array1[INPUT] * something];
          part of that process is finding space in the data cache to put it --> evicting something else
        - the processor will not undo that eviction when it detects the branch was mispredicted
          (even though it will discard all the values from registers, etc.)
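- a C sketch of the variant 1 pattern above (a sketch only: the array1/array2/array1_size names
  follow these notes, while the extern declarations and the * 512 stride are made-up details for
  illustration):

    #include <stddef.h>
    #include <stdint.h>

    /* victim code (e.g. in the kernel); the attacker controls INPUT */
    extern uint8_t array1[];      /* in-bounds data; memory past its end holds the secret       */
    extern size_t  array1_size;
    extern uint8_t array2[];      /* "probe" array whose cache footprint the attacker can time  */

    void victim(size_t INPUT) {
        if (INPUT < array1_size) {            /* predicted taken after "training" with small INPUTs */
            uint8_t value = array1[INPUT];    /* speculative out-of-bounds read when INPUT is large */
            (void) array2[value * 512];       /* which cache set gets evicted now depends on value  */
        }
    }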
- we can learn about array1[INPUT] based on which part of the cache was evicted
    - because array1[INPUT] is out of bounds and we can choose INPUT, this can be some value we're
      not supposed to have access to that we (the attacker) choose
      (we need to learn what the memory layout is to choose INPUT)
    - we determine that information from cache access times
    - equations for the unknown value:
        let's say array2[0] maps to cache set X, offset 0
        then array2[0 + K * BLOCK_SIZE / sizeof(array2 element)] maps to cache set (X + K) mod #SETS
        let's say we learned that the access to array2[array1[INPUT] * Q] evicted something from set Y
        --> X + K = Y (mod #SETS)
            K * BLOCK_SIZE / sizeof(array2 element) = array1[INPUT] * Q
        solve for K from the first equation, then from the second
            array1[INPUT] = K * BLOCK_SIZE / (sizeof(array2 element) * Q)
            ~ learning about the index bits of the chosen value
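- a made-up numeric instance of the equations above (64 sets, 64-byte blocks, 1-byte array2
  elements, and Q = 64 are illustrative choices, not values from an exam question):
    suppose array2[0] maps to set X = 10 and we observe an eviction in set Y = 37
    K = Y - X (mod 64) = 27
    array1[INPUT] * Q = K * BLOCK_SIZE / sizeof(array2 element) = 27 * 64 / 1 = 1728
    array1[INPUT] = 1728 / Q = 1728 / 64 = 27
    (since K is only known mod #SETS, we only learn array1[INPUT] mod 64 here -- its index bits)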
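- going back to the monitor pattern earlier in these notes: a minimal pthread producer/consumer
  sketch of that pattern (the bounded LIFO buffer, its size, and the function names are made-up
  details; this is one reasonable way to write it, not the only one):

    #include <pthread.h>

    #define CAPACITY 16

    /* shared data: everything we need to decide whether to wait */
    static int buffer[CAPACITY];
    static int count = 0;                   /* number of items currently queued */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;  /* one cond. variable per thing we wait for */
    static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;

    void produce(int item) {
        pthread_mutex_lock(&lock);
        while (count == CAPACITY)                   /* re-check after waking (race / spurious wakeup) */
            pthread_cond_wait(&not_full, &lock);    /* releases the lock while we wait */
        buffer[count++] = item;
        pthread_cond_signal(&not_empty);   /* at most one waiting consumer can use this item -> signal */
        pthread_mutex_unlock(&lock);
    }

    int consume(void) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&not_empty, &lock);
        int item = buffer[--count];
        pthread_cond_signal(&not_full);    /* we freed exactly one slot -> at most one producer can go */
        pthread_mutex_unlock(&lock);
        return item;
    }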