- kernel vs. user mode
  - some operations are "dangerous":
    - changing the page table base pointer: if you allow anyone to do this, they'll corrupt other programs' memory
    - accessing the hard drive directly: if you allow anyone to do this, they'll corrupt other users' files (or read files they aren't supposed to)
    - changing the exception table pointer: if you allow anyone to do this, they might hang the system
    - accessing pages marked as "kernel mode only" in the page table
  - kernel mode is processor support for limiting these dangerous operations
    - dangerous operations in kernel mode: work
    - dangerous operations in not-kernel mode = "user mode": cause an exception
  - exception handlers run in kernel mode *** the only reason we switch to kernel mode
    - returning from an exception handler restores the prior user mode
- when does something happen in kernel mode?
  (1) is there a reason for an exception to happen?
      e.g. is something going to make a system call?
           is something going to access invalid memory? etc.
  - example: decrementing the stack pointer --- no reason for an exception
    - if the OS allocates stack memory on demand, that happens when the program tries to use the memory, not when the stack pointer moves
  - example: malloc needs to change the page table sometimes
    - makes a system call --> enters kernel mode
  - example: write to a pipe
    - needs to access the kernel's data structures about what's in the pipe, probably
- system call arguments
  - on the user stack (because of the 32-bit x86 calling convention)
    - system calls read them by looking up the user stack pointer
  - system call number --- in the user's eax (looked up from the trapframe)
    - just a convention choice
- trapframe location
  - when an exception happens, xv6 saves the trapframe on the kernel stack of the thread that was running
  - the xv6 kernel never puts things on the user stack
    (exception: if the user passed a stack pointer to read(), etc.)
- context switches / where data is stored
  - in xv6, a context switch always happens after saving state for an exception
    - user data is stored in the trapframe on the kernel stack
  - in xv6, context switches always switch to another thing running in kernel mode
    - if switching to another process, its registers were saved by a prior exception on its kernel stack
- what system calls context switch?
  - when the current program can't keep running
    - example: waiting for a keypress
  - when a system call explicitly requests one
    - our version of xv6: "yield"
- kernel buffering
  - typically, calling read() or write() doesn't fetch data directly from the
    * hard drive
    * process at the other end of the pipe
    * network
    instead it fetches data from a buffer that the kernel maintains
  - reasons:
    - data for read() could come in before you call read()
    - what the other process or hard drive etc. sends the OS is not the same amount that you read
      e.g. the hard drive gives you 1024 bytes, but you read 700; the OS should keep all 1024, so when you call read() again, it can use the remainder
    - data for write() can't make it to the destination right away
      the OS will keep some extra, but not too much --- write() will return less than you asked for if there isn't enough space
  - stdio/iostream buffers are IN ADDITION TO kernel buffers
    cout << ....
      ---> [cout's buffer] ---> [kernel's buffer] ---> [actual destination]
- MLFQ
  - multi-level feedback queues use priority scheduling to approximate SRTF (shortest remaining time first)
    - high priority = short CPU burst
  * enforce that high-priority processes have short CPU bursts by giving them a small time quantum and moving them to lower priority if they use the whole thing
    - lower priorities get larger and larger time quantums
  * move low-priority processes that don't use their whole time quantum to higher priority
  - intuition: processes should end up "around" the priority level whose time quantum matches their CPU burst
    (they probably actually end up alternating, b/c they'll use less than priority X's quantum and more than X+1's)
- which scheduler when
  - throughput
    - problem: too many context switches
    - problem: complicated scheduling decisions
    - first-come first-served
  - turnaround time / wait time
    - problem: need to prioritize short CPU bursts
    - shortest remaining time first (if we can do preemption)
  - "fairness"
    - equal chance to run as anyone else when you can run
      - round-robin
      - lottery with equal ticket weights
    - equal amount of running as everyone else
      - CFS-like virtual time [Linux's Completely Fair Scheduler]
  - "weighted fairness"
    - weight 2 --> you should run twice as much
    - for running when you can run: lottery (twice the ticket count)
    - for amount of CPU time over time: CFS-like virtual time divided by your weight
  - some things are more important --- priority
  - deadlines --- real-time schedulers like earliest deadline first
- #4 S2019 --- POSIX pipes, dup2
    int fds[2];
    pipe(fds);
    pid_t pid = fork();
    if (pid > 0) {
        dup2(fds[0], STDOUT_FILENO);                      *** parent
        >> make STDOUT_FILENO refer to the read end of the pipe
        >> fortunately, we never use STDOUT_FILENO after this, so it doesn't matter
        >> I could do read(STDOUT_FILENO, ...) to read from the pipe (probably not useful)
        >> write(STDOUT_FILENO, ...)
             ---> error: "that's a read file descriptor"
        >> remember: the array of file descriptors is separate for each process
        waitpid(pid, NULL, 0);                            *** parent
    } else {
        write(STDOUT_FILENO, "Hello!", strlen("Hello!")); *** child
        >> write Hello! to stdout
        write(fds[1], "Bye!", strlen("Bye!"));            *** child
        >> write Bye! to the pipe
           -- if no room in the pipe, this hangs (b/c the parent doesn't read it)
           -- if room in the pipe, it doesn't go anywhere
        exit(0);
    }
- file descriptors
  - array of pointers to file information
  - dup2: copies a pointer from one element to another
  - close: NULLs an element (and does appropriate garbage collection)
  - open, pipe, etc. use free (currently NULL) array elements
  - per-process
  - exec doesn't change the open files (as far as we've discussed)
- #5 F2018 --- pthreads possible outputs (assuming no load/store reordering)
    int x = 50;
    void child(void *ignored_argument) {
                                        // <-- (A)
        x = x + 10;                     // (B)
          ---> temp <- load x           // <-- (A) <-- if here, the x = x - 1 is "lost"
          ---> temp <- temp + 10        // <-- (A) <-- if here, the x = x - 1 is "lost"
          ---> store temp into x
                                        // <-- (A)
        printf("[child]␣%d␣", x);       // (C)
                                        // <-- (A)
        x = x + 10;                     // (D)
          ---> temp <- load x           // <-- (A) <-- if here, the x = x - 1 is "lost"
          ---> temp <- temp + 10        // <-- (A) <-- if here, the x = x - 1 is "lost"
          ---> store temp into x
                                        // <-- (A)
    }
    int main(void) {
        pthread_t child_thread;
        x = 100;
        pthread_create(&child_thread, NULL, child, NULL);
        x = x - 1;                      // (A)
        pthread_join(child_thread, NULL);
        printf("[main]␣%d", x);         // (E)
        return 0;
    }
    // E runs after A, B, C, D
    // D runs after B, C
    // C runs after B
  - possibilities to consider:
    - A happens before B
    - A happens during B or D ---> effectively, A's result is lost
    - A happens after B, before C
    - A happens after C, before D
    - A happens after D
    - A happens overlapping with B/C/D
      example: A reads x before B, but stores after D
- mutex implementation (one possibility)
  - struct:
    - spinlock [always locked before using the list of waiting threads/status]
    - list of waiting threads
    - status: "locked" or "not locked"
- lockMutex():
    lock spinlock
    check status
    if status == locked:
        add self to waiting threads
        unlock spinlock and go to sleep
    otherwise:
        set status = locked, unlock spinlock
- unlockMutex():
    lock spinlock
    if there's a waiting thread:
        wake up a waiting thread
    else:
        status = not locked
    unlock spinlock
- mutex usage
  - choose a mutex that covers each thing you share
    -- in a monitor, this state includes "do I need to wait"
    -- in a monitor: holding the mutex means you know you won't suddenly stop needing to wait after you decide to wait
  - lock before accessing the thing you share
  - unlock afterwards
- rwlock usage
  - a mutex can only be held by one thread, but that seems wasteful since multiple threads can safely read something at the same time
  - read-lock: "lock for reading" --- multiple threads can do this at once
  - write-lock: "lock for writing" --- one thread at a time; don't allow other writers or readers to act at the same time
- rwlock implementation
  - track the number of readers/writers
  - use some sort of simpler lock to protect your tracking of the number of readers/writers
  - use the tracked # to decide when to wait or not (pthread_cond_wait or sem_wait)
  - if you care about whether readers or writers go first, also track the # of waiting readers or writers and decide whether to wait taking that into account (e.g.
    if writers go first, readers check whether there are waiting writers)
- semaphores
  - have a count ~ # of slots available
  - operation: "down"/"wait"/"P" --- reserve a slot, or if one is not available, wait until it is
  - operation: "up"/"post"/"V" --- make a new slot available (which might wake up a thread that previously asked to reserve one)
  - intuition: what number can we track such that I want to wait when the number is 0?
  - example: # of free spaces in a buffer
  - common pattern: "mutex" semaphore: 1 when "unlocked", 0 when "locked"
  - common pattern: "gate" semaphore: starts at 0; call "down" to wait for another thread to let you go, and the other thread calls "up"
    (choose how many threads go by calling "up" the right number of times)
- calculating page table addresses, etc.
  - inputs: virtual address, page table base address, (whatever's in the page tables)
  - first step: split the virtual address
      [ virtual page number       ][ page offset ]
      [ VPN part 1 ][ VPN part 2  ][ page offset ]
                                   ^^^^^^^^^^^^^^
                                   big enough to identify a byte in a page:
                                   log2(page size) bits
  - use the first part as an array index into the first-level page table
    - index 0 of the first-level page table = page table base address (usually stored in a register while a process is running)
    - index 1 = base address + sizeof(page table entry)
    - index 2 = base address + sizeof(page table entry) * 2
    - index (VPN part 1) = base address + sizeof(page table entry) * (VPN part 1)
  - the first-level page table entry gets you the base address for the second level (but usually as a page number, not a byte address)
      [ physical address                         ]
      [ physical page number       ][ page offset ]
    then follow the same math, but with the new base address and the next part of the VPN
  - once you get the last-level page table entry, take its physical page number and combine it with the original virtual address's page offset
  - each entry in the first-level table is used for a range of virtual page numbers (which all have the same VPN part 1)
  - each second-level table is shared
    by the virtual page numbers that have the same VPN part 1
- deadlock/hanging with read/write
  - if we write() and no one is reading, that write can run out of space, and with pipes it will wait for space (with default settings)
  - if you can't show that someone's reading, we might have a problem
  - if you can show that it's definitely the case that no one is reading, then you almost certainly have a problem
- is waiting for keyboard input implemented with a CV in the kernel?
  - sometimes (but that means the kernel needs its own CV implementation)
  - or sometimes a semaphore
- fflush: empty the printf buffer
- event-based programming
  - rather than having thread A read/write from X and thread B read/write from Y,
    have one thread which asks "what's the next thing I can do with X or Y?" in a loop
- pool assignment