- logistics:
    - what topics may be covered
        - I'm looking through the course schedule to construct the exam
        - probably some topics on the schedule can't make it for space
        - approximately evenly distributed across weeks
    - ways to prepare
        - under "study materials" on the website are past exams+quizzes
        - in most previous semesters there was a midterm+final (probably the exam will be a mix of midterm+final)
        - some topics from previous semesters weren't covered this semester
    - curve
        - the exam score will be the raw score
        - "Overall [whole course, not just exam] raw scores will be translated to final grades in a way to be determined (depending on the actual difficulty of coursework), but overall scores above 90% will be at least an A-, above an 80% at least a B-, above a 70% at least a C-, and above a 60% at least a D-."
    - open book/notes/etc.
        - yes, but individual --- no looking for questions asked after the exam was released, etc.
    - length
        - equivalent to around 30 MC
    - format
        - mostly MC or short answer similar to quizzes
        - small number of "write a few lines" questions
            - we'll expect work in the comments so we can give partial credit (will be looked at much more than typical quizzes)
- thread versus process control block
    - thread ~~ "virtual" processor
        ~ its own registers
            ~ PC
            ~ general purpose registers (%eax/%rax, stack pointer, etc.)
                ~ stack pointer will point to own stack
                ~ but all the stacks in the process will be in the same page table, etc.
        ~ state related to scheduling
            ~ is it running, runnable, waiting for I/O, etc.
            ~ e.g. if lottery scheduler: the ticket count; if CFS: the virtual time
        ~ kernel stack typically
    - process ~~ collection of threads ("virtual processors") + memory + I/O
        ~ one or more thread control blocks
        ~ memory layout
            ~ page table
            ~ data about what's in the memory
                ~ list of mmap's
                ~ where the end of heap is
                ~ etc.
        ~ I/O access:
            ~ file descriptors (on Unix-like)
            ~ current working directory
            ~ user ID, group ID, etc.
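As a rough illustration of the thread-versus-process split above, here is a minimal sketch of the two control blocks as plain data. All class names, field names, and addresses are hypothetical and chosen only for illustration; a real kernel stores this information in its own structures.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ThreadState(Enum):
    RUNNING = auto()
    RUNNABLE = auto()
    WAITING_FOR_IO = auto()


@dataclass
class ThreadControlBlock:
    # Saved "virtual processor" state (hypothetical field names).
    pc: int
    registers: dict                      # general purpose registers, incl. "sp"
    state: ThreadState = ThreadState.RUNNABLE
    sched_info: dict = field(default_factory=dict)  # e.g. tickets or virtual time
    kernel_stack: int = 0                # address of this thread's kernel stack


@dataclass
class ProcessControlBlock:
    threads: list                        # one or more ThreadControlBlocks
    page_table_base: int                 # one page table shared by every thread
    mmaps: list = field(default_factory=list)
    heap_end: int = 0
    fd_table: dict = field(default_factory=dict)   # index --> open file description
    cwd: str = "/"
    uid: int = 1000
    gid: int = 1000


def make_example_process():
    """Build a process with two threads that share one page table."""
    t1 = ThreadControlBlock(pc=0x401000, registers={"sp": 0x7FFF0000},
                            sched_info={"vruntime": 0.0})
    t2 = ThreadControlBlock(pc=0x401200, registers={"sp": 0x7FFE0000},
                            sched_info={"vruntime": 0.0})
    return ProcessControlBlock(threads=[t1, t2], page_table_base=0x1000,
                               fd_table={0: "stdin", 1: "stdout", 2: "stderr"})
```

Each thread carries its own PC, stack pointer, and scheduling state, while the page table, file descriptor table, and user/group IDs live once in the process.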
- kernel versus user mode
    ~ processor enforces that some things only "special" code can do
    ~ how do we know if code is special:
        ~ we track an extra bit in the processor: "are we in kernel mode?"
        ~ we flip this bit on when we do things that start the OS
            ~ most notably: handling an exception is assumed to run the OS
        ~ the OS tells the processor to flip the bit to off when it runs program code
    ~ what sort of operations are reserved only for special code?
        ~ talking to I/O devices
        ~ handling exceptions
            ~ including setting the exception table pointer (telling processor where exception handlers are)
        ~ changing page table base pointer
        ~ accessing memory marked in the page table as kernel-only
- system calls
    ~ program asking the kernel (part of OS that runs in kernel mode) to do something for it
    ~ why isn't the program just doing it itself
        ~ b/c some operations require kernel mode
    ~ the program deliberately triggers an exception
        ~ exception handlers always run in kernel mode
        ~ and what they do is set up by the OS only
        ~ the OS makes a decision that triggering exceptions in certain ways will make it do something on behalf of the program
        ~ something like a calling convention is chosen by the OS to have the program communicate what it wants the OS to do
- pipes and file descriptors
    ~ Unix design: every process has a table of file descriptors:
        index in table --> "open file description"
        system calls expect an index into the table when you ask to read/write a file
    ~ Unix design: files represent not just files on disk ("regular files") but also most devices (keyboard, the screen, etc.)
      and "pipes" and "sockets" that can communicate with other programs
        ~ files provide the interface: read() + write() + close()
    ~ convention: index 0 --> stdin, index 1 --> stdout, index 2 --> stderr
    ~ a pipe is a special file where we write to it, and then can read back what we wrote
    ~ pipe() library call creates a new pipe and gives us two file descriptors for it:
        ~ one file descriptor allows us to read from the pipe ONLY
        ~ one file descriptor allows us to write to the pipe ONLY
    ~ when we write to a pipe, the kernel will keep a small amount of written but not-yet-read data in memory
        ~ if it runs out of space, write() just waits for someone to read() to free up space
- CFS -- scheduling decisions
    ~ "completely fair scheduler"
    ~ idea: to give every thread a "fair" amount of virtual time
        ~ fair = same after taking into account any weight assigned to thread
    ~ identify who (that can run now) is furthest away from having their fair share of virtual time
        ~ let them run
    ~ virtual time = amount of time you've had the CPU
        ~ with adjustments for time you weren't runnable
        ~ and possibly for your thread being weighted more than others (as configured by sysadmin)
- copy-on-write
    ~ normally: we'd copy something (for example) when fork() is called
    ~ instead: let's not copy it until the last possible moment:
        ~ when is the last possible moment:
            ~ when someone tries to modify one of the copies
            ~ before this --- we can just use the original version
    ~ solution: set all the "pretend copies" as read-only
        ~ if the processor complains that a program tried to write to the read-only thing
            ~ then actually copy it
        ~ when we do the copy, we can copy as little as the page the program was trying to access (to make that access work --- and delay the other copying until later)
- mmap
    ~ make it appear that a file is part of your program's memory
    ~ two modes for mmap:
        ~ give you a "private" copy of the file
            ~ as if we copied the file (or part of it) into your memory
            ~ but we can actually implement this with copy on
              write
        ~ give you "shared" access to the file
            ~ when you access your memory, it accesses the "real" file
            ~ how? the OS caches the file (temporarily) in memory
            ~ ... and gives you direct access to that cache (through your page table)
        ~ in both cases:
            ~ the OS doesn't need to set the page table entries immediately, it can just remember where the mmap area of memory is, and set the page table entries when you try to use that area of memory and the processor complains it is not set up yet
- page cache
    ~ we keep files and program data "temporarily" in memory but they have a permanent location on disk
        ~ for non-file data: we make up a location on disk
    ~ probably for performance, we'd like to use the location on disk not very often
    ~ this means we need to be able to remove something that's temporarily in memory from memory
    ~ two cases:
        ~ the thing in memory is the same as on disk
            --> change bookkeeping information to know it's not in memory anymore
        ~ the thing in memory is NOT the same as on disk
            --> we need to first write the changes in memory to disk (so we don't forget them)
            --> we track if things in memory have changed
            --> when some page is written we set a "dirty bit" to mark it as changed-but-not-saved
- set-user-ID
    ~ normally:
        ~ run a program --> it has the same access as the program that starts it (and any other program I run)
    ~ set-user-ID programs are ones where we want some programs to have extra access
        ~ usually because the system administrator wants to have a way for users to do things that their normal programs can't/shouldn't do
        ~ example: if we want to let you load a USB stick
            ~ this requires normally restricted access to the filesystem system calls
    ~ system administrator creates a program owned by a privileged user (usually UID 0) and marks it as set-user-ID
        ~ then, when the program is executed, the kernel records its user ID as the owner (usually 0)
            --> that program can do things that other programs executed the same way can't
    ~ the system administrator needs to make sure that the
      program only does things that they are okay with: e.g. maybe it lets you load USB sticks, but not change how the normal hard drives are configured
- 2021 final Q12 ~~ pthreads CVs
    ~ [note: I need to link up the 2021 final to study materials page]
    ~ broadcast to wake up from WaitForState()
        ~ also need to wait for done to become high enough
    ~ code in question was missing a pthread_cond_signal from WaitForState to wake up the SetState thread
        ~ while (done != expect_done) pthread_cond_wait(...) <-- need cond_signal/broadcast to wake up the thread from the cond_wait
- 2021 final Q13 ~~ deadlock
    ~ inconsistent lock order:
        CopyAllFilesIn: lock DIR + wait for file lock;
        MoveFileToDirectory: lock FILE + wait for directory lock
- page replacement policies
    ~ page replacement policy: when we have pages cached in memory and need to free up some space, what do we stop caching?
    ~ like scheduling there are multiple possible goals: we mostly talked about the goal of maximizing the hit rate / minimizing the # of replacements we'd have to do (other goals: fairness between programs; overall performance; ...)
    ~ optimal policy for maximizing hit rate, assuming we only replace things just when we need to load something new:
        ~ Belady's MIN: replace the thing accessed furthest in the future
    ~ usually least-recently-used-like policies are a good approximation of this (but no guarantee)
    ~ so we try to approximate LRU by monitoring accesses to pages and choosing things that weren't accessed recently (multiple options for how to do this)
        ~ example: active list of pages which were loaded more recently and inactive list of pages that were on the active list for "too long", and check for accesses to the pages on the inactive list (possible intuition for why to do this?
          not worth your time to check whether pages that were recently loaded are still being accessed)
    ~ in addition, we can:
        ~ try to load pages before they're accessed (example: readahead: if a process accesses 1, 2, 3, then load 4, 5, 6)
        ~ try to have heuristics to detect other specific access patterns (example: looking for accessing a file once and not using it again)
- Q13 Q2
    ~ set-user-ID option C:
        ~ normally when we run a program, even if it's owned by the system administrator it still runs with the same access that our normal programs have:
            ~ example: ls is normally owned only by the sysadmin, but when we run ls it can only access files we can normally access
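To make the set-user-ID point concrete, the sketch below reads the two user IDs a Unix-like kernel tracks for a process: the real UID (who started the program) and the effective UID (whose access is used for permission checks). The function name `describe_ids` is made up for this example. Note that Linux ignores the set-user-ID bit on interpreted scripts, so this only demonstrates the calls; for an ordinary program such as ls the two IDs would normally match.

```python
import os


def describe_ids():
    """Return the real and effective user IDs of the current process."""
    real_uid = os.getuid()        # the user who ran the program
    effective_uid = os.geteuid()  # the user whose access the kernel checks
    return {
        "real_uid": real_uid,
        "effective_uid": effective_uid,
        # The IDs differ when a set-user-ID program is run by someone
        # other than its owner.
        "ids_differ": real_uid != effective_uid,
    }


if __name__ == "__main__":
    print(describe_ids())
```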