- logistics: what's on the exam
    - everything covered up until now in lecture
    - some questions on virtual memory
    - definitely nothing on page cache
- partial read()s and write()s
    - POSIX API: read(file descriptor, buffer, amount) reads UP TO amount bytes
      sometimes we don't have amount bytes available:
        ~ just make the read fail
        ~ just return what you have   <---- POSIX's choice
        ~ wait until you have everything
      POSIX strategy (for file descriptors like pipes):
        - wait until I have at least one byte
        - then return whatever I have
    - POSIX API: write(file descriptor, buffer, amount) writes UP TO amount bytes
      sometimes we can't actually send amount bytes right away:
      POSIX strategy (for file descriptors like pipes):
        ~ wait until I can write at least one byte
        ~ write what I can now
        ~ then return how much I've written
- kernel versus user mode (what happens in each)
    - kernel mode is a processor feature
    - the processor only allows certain things in kernel mode:
        ~ most I/O (e.g. x86's in/out instructions; special memory locations; etc.)
        ~ setting up exception handling
        ~ changing the page table base register
        ~ accessing virtual pages marked as kernel-mode-only in the page tables
    - the processor supports switching to kernel mode ONLY via exception/trap/interrupt/etc.
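Since read() may return fewer bytes than requested, callers that need a full buffer typically loop until they have it. A minimal sketch of that pattern (the helper name read_fully is made up for illustration, not from lecture):

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* read_fully: keep calling read() until `want` bytes arrive, EOF, or error.
 * Returns the number of bytes actually read (may be < want at EOF),
 * or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t want) {
    size_t got = 0;
    while (got < want) {
        ssize_t n = read(fd, (char *)buf + got, want - got);
        if (n < 0) {
            if (errno == EINTR) continue;  /* interrupted early: just retry */
            return -1;                     /* real error */
        }
        if (n == 0) break;                 /* EOF: no more bytes are coming */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

The same shape works for write(): loop, advancing the buffer pointer by whatever the partial write returned.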
- convention in Unix-like OSes:
    ~ system call handling, the scheduler, and I/O run in kernel mode
    ~ ***all*** user code runs in user mode
- context switching
    - context: the stuff we need to run a thread on the CPU
        ~ minimally: registers + PC
    - context switching operation:
        save registers (including stack pointer) + PC of old thread
        restore registers + PC of new thread
    - xv6: swtch() function --- only deals with kernel registers
        save registers on the old kernel stack
        restore registers from the new kernel stack
    - xv6 only switches between programs in kernel mode
        - this means that user registers are saved because of an exception
          and NOT because of a context switch
    - context switching between processes also needs to save/restore the address space
        save old page table base register  [xv6: it's already stored in the struct proc]
        restore new page table base register  [xv6: load from struct proc into the register]
        - xv6: separate step from the kernel context switch
          (first swtch() to change kernel stacks, then a different function
          switchuvm() to change page tables)
- differences between scheduling algorithms
    - preemptive versus non-preemptive
        preemptive means when a program becomes ready, we run it right away
        non-preemptive means we only stop a program when it gives up the CPU
    - scheduling goals:
        - "fair" sharing
            CFS ("Completely Fair Scheduler") --- medium-term equal CPU usage
                medium-term ~~ "credit" for not needing the CPU (or proportional to weights)
            RR --- everyone gets a turn (if runnable)
                short-term ~~ if you can't use the CPU, you lose that time
            lottery --- proportional to weights, random choice
                short-term ~~ if you can't use the CPU, you lose that time
        - throughput --- minimal overhead (context switch overhead + scheduling decision time)
            FCFS (first-come first-served) / FIFO
                just have a list, run the next thing until completion
        - mean turnaround/wait time optimizing --- interactivity-optimizing ---
          try to get I/O to happen sooner
            SJF/SRTF --- run the shortest CPU burst (to do I/O faster)
            MLFQ --- approximate SJF/SRTF by measuring CPU bursts,
                because we can't actually know CPU bursts in advance
        - meeting deadlines
            Earliest Deadline First
        - other goals
            Priority
- process versus thread resources
    - thread ~~ what we need to run on a CPU
        - program counter
        - CPU registers (x86: EAX, EBX, ...)
        - stack (x86: whatever ESP points to)
    - process
        - process ID
        - current directory
        - one or more threads
        - file descriptors
            shared between all threads
        - address space (heap, code, global variables, etc.)
            shared between all threads
            contains multiple stacks (threads can get pointers to other threads' stacks)
- order of operations with multiple threads -- Q5 on last semester's exam
    int x = 50; /* <---- shared between the threads! */
    void child(void *ignored_argument) {
        x = x + 10;
        printf("[child] %d ", x);
        x = x + 10;
    }
    int main(void) {
        pthread_t child_thread;
        x = 100;
        pthread_create(&child_thread, NULL, child, NULL);
        x = x - 1;
        pthread_join(child_thread, NULL);
        printf("[main] %d", x);
        return 0;
    }
    assuming no reordering, etc.
    --- child is joined before main outputs --- eliminates:
        a. [main] 110 [child] 119
    *a. [child] 110 [main] 109
        child: x = x (100) + 10
        main: load x for x - 1
        child: print
        child: x = x (110) + 10
        main: computes x - 1 = 109
        main: store x
        main: print
    --- no way of adding 10 or subtracting 1 to get 120 --- eliminates:
        a. [child] 109 [main] 120
        we know the value is 109 when child prints,
        and only +10 and -1 can run after that
    --- x starts at 100 before child runs --- eliminates:
        a. [child] 60 [main] 99
    *a. [child] 110 [main] 119
        child: x = x (100) + 10
        child: prints
        child: x = x (110) + 10
        main: x = x (120) - 1
        main: prints
    *a. [child] 110 [main] 99
        main: read x for x = x - 1
            question says loads/stores aren't reordered,
            but we still have to load, then subtract, then store
        child: x = x (100) + 10
        child: prints
        child: x = x (110) + 10
        main: compute x - 1 from the previously read value
        main: store x - 1 in x
        main: print
    --- child is joined before main outputs --- eliminates:
        a. [main] 110 [child] 109
    *a. [child] 109 [main] 119
        main: x = x - 1 --> 99
        child: x = x + 10 --> 109
        child: print
        child: x = x + 10 --> 119
        main: print
- cache coherency ~~ quiz question
    start: A: Invalid / B: Invalid (b/c question said so --- not cached)
        invalid = not cached
    Q: how many updates/invalidations for A (which only reads)?
    Processor A reads X   --> A: Shared / B: Invalid
    Processor B reads X   --> A: Shared / B: Shared
    Processor B WRITEs X  --> A: Invalid / B: Modified
        (cached copy that's not up-to-date with memory)
    Processor B WRITEs X  --> A: Invalid / B: Modified (already Modified, no work --- only B has a copy)
    Processor B writes X  --> A: Invalid / B: Modified (already Modified, no work --- only B has a copy)
    Processor B writes X  --> A: Invalid / B: Modified (already Modified, no work --- only B has a copy)
    Processor A reads X   --> A: Shared / B: Shared or Invalid
- virtual to physical address translation
    take our virtual address: split it into Virtual Page Number and Page Offset
        how big is the page offset? # bits = log_2(size of page in bytes)
    split the Virtual Page Number into parts, one for each level
        most significant bits  --> first-level
        least significant bits --> last-level
    do an array lookup based on the split Virtual Page Number at each level
        each page table is an array of page table entries
        like arrays in C: if the array is at address 0x50000,
            then element 0 is at address 0x50000
            and element 1 is at address 0x50000 + sizeof(array element)
            and element 2 is at address 0x50000 + sizeof(array element) * 2
            ...
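As a concrete instance of the splitting and array-indexing described above, here is a sketch assuming x86-style 32-bit addresses with 4 KiB pages (12-bit offset, 10-bit index at each of two levels, 4-byte entries); the function names are illustrative, not from lecture:

```c
#include <stdint.h>

/* 12 = log2(4096), the page size; 10 index bits per level on 32-bit x86 */
#define PAGE_OFFSET_BITS 12
#define INDEX_BITS       10
#define PTE_SIZE          4

/* low 12 bits: byte offset within the page */
uint32_t offset(uint32_t va)     { return va & ((1u << PAGE_OFFSET_BITS) - 1); }
/* next 10 bits: index into the second-level (last-level) table */
uint32_t level2_idx(uint32_t va) { return (va >> PAGE_OFFSET_BITS) & ((1u << INDEX_BITS) - 1); }
/* top 10 bits: index into the first-level table */
uint32_t level1_idx(uint32_t va) { return va >> (PAGE_OFFSET_BITS + INDEX_BITS); }

/* the C-array rule from the notes: entry address = base + index * sizeof(entry) */
uint32_t pte_addr(uint32_t table_base, uint32_t index) {
    return table_base + index * PTE_SIZE;
}
```

For example, virtual address 0x12345678 splits into first-level index 0x48, second-level index 0x345, and page offset 0x678.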
    the first-level page table entry contains the PHYSICAL page number of the second-level table
        physical --- because we don't want to do recursive virtual-to-physical lookups
        converting a physical page number to the physical address of a particular byte:
            add a page offset of 0 (same as multiplying by the page size)
    once we find the last-level page table entry,
        it contains the PHYSICAL page number of the data
        combine that physical page number with the page offset from the original virtual address
- race conditions: avoiding them
    - one way which works (but might be more conservative than needed):
        if you have shared data, always lock before making changes,
        and never assume anything about what happened while you didn't have it locked
- avoiding busy-waits
    - scenario: waiting for an event X to happen
        naive (slow busywait): check if it happened, check if it happened, check if it happened...
    - two primitives to help:
    - condition variables:
        ~ the thing we need to wait for MUST be protected by a lock
          (hopefully you're doing that anyway to avoid race conditions; see rule above)
        ~ check if it happened; if not, release the lock + start waiting
        * extra task: whichever other thread makes X happen needs to broadcast (or signal) us
        ~ recheck the condition after waking up (if nothing else, b/c of spurious wakeups)
    - semaphores: (analogy: "opening a gate")
        if you know someone's always waiting:
            ~ semaphore that's initially 0
            ~ wait (down) on the semaphore
            * extra task: up on the semaphore (always)
        if you don't know someone's always waiting:
            ~ semaphore that's initially 0
            ~ check if we need to wait
            ~ mark that we've started waiting
            ~ release lock
            ~ wait (down) on the semaphore
            * extra task: if someone's started waiting, then post (up) on the semaphore
- implementing a spinlock
    - simplest strategy: an atomic operation that reads a value and then writes LOCKED
        if we read UNLOCKED, we know that THIS THREAD set it to LOCKED
        "test and set"
- signal versus broadcast
    - signal means at most one thread goes per time this event happens
        tricky part: need to prove the "at most one"
            UP on a semaphore: yes, at most one thread
            0 -> 1 on a semaphore: not necessarily, because 1 -> 2 isn't accounted for
                (might be waking up fewer things than need to go)
    - question: when will the second thing that's waiting when you signal be woken up?
        is that really guaranteed?
- copy on write
    - copy one page or multiple pages?
        - the OS only needs to copy one page
        - it has the option of copying more, maybe
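Looping back to the busy-wait section: the condition-variable recipe (protect the state with a lock; check; wait; recheck in a loop after waking) can be sketched with pthreads. The names (happened, wait_for_x, make_x_happen) are illustrative, not from lecture:

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv   = PTHREAD_COND_INITIALIZER;
static bool happened = false;   /* the waited-for state, protected by lock */

void wait_for_x(void) {
    pthread_mutex_lock(&lock);
    while (!happened)                    /* recheck after every wakeup
                                            (spurious wakeups are allowed) */
        pthread_cond_wait(&cv, &lock);   /* atomically releases lock while
                                            waiting, reacquires on return */
    pthread_mutex_unlock(&lock);
}

void make_x_happen(void) {
    pthread_mutex_lock(&lock);
    happened = true;
    pthread_cond_broadcast(&cv);         /* wake everyone waiting on cv */
    pthread_mutex_unlock(&lock);
}
```

Note the while loop rather than an if: even without spurious wakeups, another thread could change the state between the signal and the waiter reacquiring the lock, so rechecking is the safe default.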