- flow of a system call
    [user library code] (xv6: "usys.S")
        "system call wrapper" -- provides a C function interface to syscalls
        contains the special instruction to cause an exception
            recall: exception = enter kernel mode
        does any special argument setup --- OS's choice of calling convention for syscalls
            xv6: normal arguments (the ones from C) passed the same way C passes them normally
                --> on the stack (x86-32)
            system call number passed through %eax
                have to pass this somehow --- there are no special parameters to the
                "invoke syscall" instruction (unlike a normal function call, where
                the call instruction includes what to call)
    [generic trap handling code] (xv6: trapasm.S and traps.c)
        handles saving user registers so kernel code can do what it likes
            xv6: saves to the kernel stack
        stores what type of exception happened ("system call")
            [because this code is used for more than syscalls]
            and later looks that up to invoke the correct handler
    [system call dispatch code] (xv6: syscall.c)
        uses the system call number to invoke the correct handler
    [particular system call implementation] (xv6: sysproc.c/sysfile.c)
        retrieves arguments from the user stack
        actually does the requested operation
    returning from a syscall goes through these steps in reverse order
- arguments (F2019 Q1)
- counting context switches
    a context switch happens when we need to run an application
    we don't need to context switch to:
        just make it so we can run an application later [***]
        clear out a process control block
        handle I/O --- the exception handler runs in whatever program is
            currently running [***]
        etc.
    context switch != mode switch (mode switch: kernel versus user)
- context switching and stacks
    in xv6 and many other OSes:
        saving user registers is handled when exceptions happen
            works the same way whether or not there's a context switch
            same idea as why we save registers for a function call
        when we decide to do a context switch, we've already had an exception happen
            so we don't need to do anything extra to save user registers
            we do need to save kernel information for the actual switch operation
                (and restore it from the destination process)
        restoring the user information is done the same way we'd return from an
            exception if there were no context switch
            same idea as returning from a function call that saved registers
    in xv6:
        context switches push kernel registers onto the kernel stack
            (and that's where kernel registers are saved when a thread isn't running)
        entering the kernel for an exception pushes user registers onto the kernel stack
            (and that's where user registers are saved when a thread isn't running)
- fork: what code runs, what values are copied
    EVERYTHING: after fork, there are two processes and nothing is shared between them
        int global = 200;
        int function() {
            int local = 100;
            int *local_pointer = malloc(1000);
            pid_t pid = fork();
            *** // use local, global, etc.
        }
    the code labelled *** will run in two processes (if fork() works --- enough
    memory, etc.): in the parent and in the child
    each will have separate copies of global and local and local_pointer and
    *local_pointer
        so there is no way that what the parent does with global, local,
        local_pointer, or *local_pointer after forking will affect the child,
        and vice-versa
    because we have address translation, the parent and child will have the same
    virtual addresses allocated, and the same virtual addresses for everything,
    immediately after fork()
    -----
    the same is true of the set of file descriptors
        (though they'll still point to the same files/pipes/etc.)
- MLFQ: estimating how long things will run
- quiz 4 Q2: estimating how long until things start running
    * the idea of MLFQ is that things with short CPU bursts settle at a higher
      priority than things with long CPU bursts
    --> the back-of-the-envelope model is:
        * pretend everything has static priorities
        * look at how much time is left for program A after everything with
          higher priority than A runs as much as it wants to
        * don't need to consider things at lower priority than A in figuring
          out how much A runs
    need to do extra work if:
        * there might be multiple things at the same priority
        * priorities are fluctuating
        * it matters exactly when programs become runnable relative to each other
    --> non-back-of-the-envelope: simulate what happens. Track the priorities
        of everything and say "this runs for X ms", then ...
    * for things with fluctuating priorities:
        * the time quantum limits how long they can stay at a higher priority
        * e.g. if program B can sometimes be at a higher priority than A, it
          can only do so for (time quantum at the higher priorities) amount of
          time each time it gets to run
- monitors/mutexes versus semaphores
    * monitors = mutex + CVs + data
        can accomplish the same things that semaphores + data can
    * semaphores combine critical-section functionality (= mutual exclusion)
      and wait-for-event functionality
        versus monitors: locks for crit. sect.
        and CVs for wait-for-event
    * semaphores have memory: they remember a count of the number of things
      that are allowed to wake up
        CVs: just deal with things that are already waiting
            waking something up on a CV now won't affect things that start
            waiting later
    * common semaphore patterns:
        lock: semaphore with initial value 1
            lock = wait/down
            unlock = post/up
        "gate": --- kinda like CVs
            semaphore with initial value 0
            let a waiting thread go = post/up
            wait for someone to let me go = wait/down
        resource tracking:
            reserve a resource (or wait for it to be available) = wait/down
            free up a resource = post/up
    * common monitor patterns
        * lock when manipulating shared data
        * while (we can't do what we want to do)
              cond_wait(some_CV, some_lock)
          NEED a while loop because:
              spurious wakeups (cond_wait can return for "no" reason)
              another thread could run in between when we stop waiting and
              when we get the lock back
                  example: if we're trying to dequeue, another thread could
                  dequeue first even though we were woken up to dequeue
                  consequence of not having Hoare scheduling: signal does not
                  "hand off" the lock
        * if (a thread might be able to do what it wants to do)
              cond_signal/broadcast(some_CV)
- reader/writer locks
    - commonly it is okay to have multiple threads read something at the same
      time, but only if no thread is writing/modifying it
    - reader/writer locks are an extension of locks that support this well
    - have two types of lock operations
        ReadLock: wait until there are no writers
        WriteLock: wait until there are no readers or writers
    - a thread is considered a reader/writer when it has done a Read/WriteLock
      and hasn't unlocked
    - Q12 S2019: fill-in-the-blank with monitors
        reader/writer lock where we have a maximum # of allowed readers
        AND with writer priority
        new condition for ReadLock: wait until there are no writers AND there
        are fewer than the max # of allowed readers AND no writers are waiting
        for the lock
        with monitors:
            ReadLock() {
                ....
                while (writers OR max # of readers OR writer waiting for the lock) {
                    cond_wait(reader_cv);
                }
            }
        ALSO need to make sure the CV is signalled when this new condition
        changes, for example:
            ReadUnlock() {
                ...
                if (no writer waiting for the lock) {
                    cond_signal(reader_cv)  // because we decreased the # of readers
                } else if (writers might be able to go) {
                    cond_signal(writer_cv)
                }
            }
        Why is it only checking that no writer is waiting?
            (1) if no thread is waiting to read, then cond_signal will notice
                that there's no thread waiting and be very fast
            (2) it's possible for two threads to ReadUnlock() before any
                waiting thread finishes ReadLock():
                C: ReadLock()
                D: ReadLock()
                C: wait  # to read
                D: wait  # to read
                A: ReadUnlock() signals C
                B: ReadUnlock() --> should signal here, or else D won't wake up
        rule of thumb: make sure you think about "what if the thing I signal
        doesn't run immediately"
- data races
    - writing a value while another thread might be reading/writing it
    - the result of the read or write might vary depending on the exact timing
      (thread scheduling, how the compiler reorders things, how the processor
      reorders things)
        --> often (but not always) leads to a bug
    example: writing a map (typically a red/black tree) while another thread
    is reading it
        the write might update a pointer and delete the original
        the read might use the deleted pointer
- deadlocks
    - circular waiting --- "I'm waiting for someone who's waiting for me"
    - most common: I've locked lock A and am trying to get lock B, while
      someone else has locked lock B and is trying to get lock A
        ... but there can be less direct circular dependencies
        (e.g. more than two threads, not just locks)
    - not the only cause of threads hanging
        - not checking a condition properly before pthread_cond_wait'ing can
          cause hanging, but it's not a deadlock (not supposed to be waiting)
        - starvation is not actually a circular dependency
            e.g.
            if you had a thread busy-waiting that always got to run instead
            of the threads actually doing computation
        - running really slowly --- not a deadlock, since a deadlock requires
          intervention to stop
- virtual memory: VA -> PA
    step one: split up the virtual address into pieces:
        [           virtual address          ]
        [ virtual page number ][ page offset ]
                               ^^^^^^^^^^^^^^-- big enough to have a unique #
                                                for each byte in a page
                                                e.g. 2^12-byte pages --> 12 bits
        [ VPN pt 1 ][ VPN pt 2 ]
        ^^^^^^^^^^^------------- index into the first-level page table
                    ^^^^^^^^^^^- index into the second-level page table
        size of VPN pt 1/2: big enough to have one value for each possible index
            e.g. 2^10 entries in the page table --> 10 bits
    step two: look up in the first-level page table
        page table base pointer --- beginning of the array of 1st-level PTEs
        access element that_array[VPN pt 1]
            at address: address of that_array + VPN pt 1 * sizeof(PTE)
        the PTE will contain:
            * valid bit --- is there a second-level page table? (if no: stop!)
            * other metadata
            * physical page number
        convert the physical page number to a physical address by concatenating
        an all-zeroes page offset (or by multiplying by the page size)
            [converts from pages to bytes]
    step three: look up in the second-level page table
        same process as step two, but:
            use the physical address we got from the PTE in step two
            use VPN pt 2 instead of pt 1
    step four: take the physical address we got from step three and add the
    page offset from the original VA
        result: the PA that will be accessed
- page faults: when do they happen
    - if we get to the "stop!" above
    - if other metadata in the PTE says "not writable/readable/etc." and we're
      trying to do a write/read/etc.
        sometimes called a "protection fault" instead of a "page fault"
- copy-on-write
    - want to: copy data, but not right now
    - mark the data read-only
        - we will get a page fault when it's written
        - copy when that page fault happens
            guarantee: the copy happens before the data is modified [and we
            lose our chance to preserve the original value]
    - relies on page faults letting us change the page table and retry the
      operation
- CFS
    key idea: virtual time ~ scaled actual time a program has been run
    main goal: fairness in virtual time
        --> highest priority to run = lowest virtual time
    secondary goal (1): don't let things that have NOT run for a very long
    time get a huge advantage
        example: if I walk away from my email client for 1hr, it shouldn't get
        ALL the CPU time when I come back
        solution: adjust virtual time within a limit when things come back
        after waiting a while
    secondary goal (2): manage context switch overhead
        solution: don't switch to the lowest virtual time so often that we're
        always context switching
            set a minimum time quantum based on the # of things
            running/parameters