- last time:
    - capabilities: token to access = ability to access
        - example: file descriptors
        - to make useful: provide way to pass between processes
        - permissions check done when acquiring capability (hopefully)
    - virtual machines: hardware / host+hypervisor / guest OS / application in guest OS
        - hypervisor tracks similar information to process control block:
            current registers; current memory; current I/O devices
        - hypervisor tracks extra information to emulate HW's kernel mode:
            does guest OS believe it's in kernel mode?
                [guest OS is always in user mode on the real hardware]
            guest OS page table pointer
            guest OS exception table pointer
            ...
    - trap-and-emulate
        hope/make sure special operations in guest OS trigger exception
            I/O
            updating things that support kernel mode on the real HW
            system calls
            ...
        in hypervisor/host OS exception handler:
            do what hardware would do
                if guest OS thinks it's in kernel mode, maybe privileged operation
                    emulate talking to I/O device
                    update emulated page table pointer
                    switch from pretend kernel to pretend user mode
                reflect exceptions back to guest OS
                    system call while guest OS is running --> triggers GUEST system call handler
    - shadow page tables
        build "shadow" page table for real hardware by combining guest + hypervisor PT
        add shadow page table entries on demand
        clear shadow page table in response to TLB flushes
        separate tables for pretend kernel versus not
-----
- SSDs
    ~ trying to provide the same interface that hard drives provide
      but with very different hardware
    ~ the SSD underlying hardware (NAND flash) can
        read anything relatively quickly (pretty much no matter what was previously read)
            [unlike HDDs: head needs to physically move]
        write ONCE anything relatively quickly
        can erase to allow re-writing in large chunks only: "erasure blocks"
            once you've written to a ~128K->1M block, you can't overwrite it until you erase the WHOLE THING
            also, if you erase blocks too much, they fail
    ~ SSD solution to provide the same interface requires
      dealing with erasure blocks problem:
        hard drives: I can overwrite one sector (512B-4K) at a time efficiently
        hard drives: I can overwrite a sector many many times and it doesn't significantly change how often it fails
        ~~ by copying data to make space to erase a large chunk
           --> the location of a particular sector (from the OS's point of view) changes in the physical media
- inodes counting
    inode stores information about _a_ file
        [file = includes regular files and directories (the root directory is a directory)]
        (size, location of the first few blocks of the file's data, who can access it, modification times, ...)
- block counting, and {single,double,triple}-indirect blocks
    inode has pointers to the first few data blocks: "direct pointers"
    [running example]
    example: inode has 10 direct pointers and blocks are 2K
        then bytes 0-20K of the file are found by following the direct pointers in the inode
    if not enough room in direct pointers in the inode, use more indirection
    single-indirect pointer
        example: inode has 1 indirect pointer
            then the indirect pointer gives the location of a data block that has 2K of stuff
            that stuff is more direct pointers
            if block pointers are 4 bytes, then there are 2K/4 = 512 additional direct pointers
            then bytes 20K through (20K + 512*2K) = 1044K will be found by following
            the indirect pointer in the inode and then another direct pointer from there
    indirect pointers not enough --- add a double-indirect pointer
        example: inode has 1 double-indirect pointer
            then the double-indirect pointer gives the location of a data block that has 2K of stuff
            that stuff is 512 more *indirect* pointers
            each of those 512 indirect pointers points to 512 additional direct pointers
            so we can find 512*512 = 262144 more direct pointers this way
            then bytes 1044K through (1044K + 262144*2K) will be found by following the double-indirect pointer
    ... we can add triple-, quadruple-, etc. if the size we can support isn't big enough yet
    1080K file?
        bytes 0-20K of this file are going to use the direct pointers
            > 10 blocks of data
        bytes 20K through 1044K are going to use the indirect pointer and fill the extra direct pointers that it points to
            > 512 blocks of data
            > +1 block of pointers (the one pointed to by the indirect pointer in the inode)
        bytes 1044K through 1080K are going to use the double-indirect pointer
            but they only need the first indirect pointer that it points to
            and that first indirect pointer is going to only need (1080-1044)/2 = 18 direct pointers filled
            > 18 blocks of data
            > +1 block that the double-indirect pointer points to
              (filled with indirect pointers, but only the first one is used)
            > +1 block pointed to by the first indirect pointer in the block pointed to by the double-indirect ptr
              (has space for 512 direct pointers, but only 18 of them are used)
        --> 10 + 512 + 18 = 540 blocks of data + (1 + 1 + 1) = 3 blocks of pointers (not all filled)
            --- the 3 pointer blocks are 6K of extra stuff
- extents
    instead of storing a list of blocks, we store a range of blocks (or a list of ranges of blocks)
    the range of blocks is called an "extent"
    typically a good idea for very large files
    challenge: need to allocate consecutive blocks
    example: for the 1080K file above, if we found 1080/2 = 540 data blocks that were free and next to each other,
        we could just store one extent (probably in the inode) to point to the file's data
- sockets
    - interface of two-pipes to access the network (i.e.
      communicate between machines)
    - once set up: write on one end and read on the other [or vice-versa]
    - setup:
        client:
            socket() --> get a file descriptor
            getaddrinfo() --> translate a hostname + service name (virginia.edu, http)
                              into an IP address + port number ("socket address")
            connect() --> associate your file descriptor with that remote socket address
        server:
            socket() --> get a "server socket" file descriptor
                         [which won't be the one we use to talk to the client]
            getaddrinfo() --> translate a hostname + service name into socket address
            bind() --> associate the "server socket" file descriptor with that address
            listen() --> tell the OS: I expect someone else to connect() to me
            accept() --> using "server socket" file descriptor, get a NEW file descriptor
                         that is set up to talk to a particular client
                --- can call accept() multiple times to talk to multiple clients
                --- separate file descriptors for each client: no confusion about who we're write()ing to
- stateful/stateless
    - stateless servers don't remember things about their clients between requests
        example: file server --- doesn't know/care what the clients have open
            just cares about getting the data/writing the data when the client wants to read/write/etc.
        example: web server --- if it's just sending web pages, shouldn't care what web pages
            the client requested before/after
        "each request stands alone"
        > less work for the server or client when dealing with server/client crashing
    - stateful servers need to remember things between requests
        example: SSH
            --- needs to make sure the command you type now and the one you type later go to the same shell session
            --- needs to remember what user you logged in as in this session
            [each session has multiple requests that are connected to each other]
- quiz 12 Q 3 [failure cases in RPC]
    Suppose network failures involving messages being lost or reordered can occur
    and machine failures can occur according to the fail-stop model. Which of the following are possible?
    [yes] sending from a client an RPC that changes something on the server but having it appear to fail on the client
        the network cable is unplugged after the server gets the request but before it can send an acknowledgment
        yes, the server retries sending this, but the network failure can outlast the server's retries
    [no] sending from a client an RPC that fails to reach the server but having it appear to work on the client
        setup for the question: client assumes RPCs fail unless there's an acknowledgment
    [yes] having an RPC appear to be processed correctly on the server but fail on the client
        [this was a poor choice because I wasn't clear whether "processed correctly" included
         sending the return value back -- I'm assuming not]
        the network cable is unplugged after the server gets the request but before it can send an acknowledgment
        yes, the server retries sending this, but the network failure can outlast the server's retries
    [yes] having an RPC appear from the server not to return the result to the client
          but actually return the result on the client
        the network cable is unplugged after the client gets the return value
        but before the client acknowledges getting the return value
- last quiz, Q4-5 [virtual machines]
    Suppose a program is running in the virtual machine and:
    A guest program executes a system call to read input from the keyboard
        guest program is executing the same code it does outside a VM
        --> system call instruction
        --> exception for a system call on the real HW
        --> hypervisor needs to run the guest OS's system call handler
            b/c that's what would've happened on the real HW
    B guest OS, while handling this system call, switches to another program and runs it in user mode.
        guest OS needs to use an instruction to change what program is running (change the guest page table)
        and to change to user mode [probably a protection fault for each of these operations]
        both of these can't be done by code running in user mode
        --> they'll result in a protection fault
        --> the hypervisor will emulate them (and actually update hypervisor data structures)
            (e.g. the hypervisor is tracking what the guest OS thinks its page table pointer is separately
             from the real page table pointer, because they need to be different)
        not visible to guest OS --> no exception in guest OS
            (guest OS thinks the operation just works normally)
    C the hypervisor detects keyboard input.
        on the real HW, an I/O device probably caused an interrupt to get the hypervisor to run
        and know that there's keyboard input
        but not yet visible to guest OS --> no exception in guest OS
    D As a result, it decides to simulate an interrupt for the keyboard.
        now there's an exception in guest OS
    E the guest OS reads a control register from the keyboard controller
        we aren't going to allow the guest OS to access the keyboard directly
        --> protection fault on the real HW
        --> hypervisor will emulate accessing the keyboard controller
    F the guest OS switches to the original program and runs it in user mode.
        (same as B)
- quiz 13 Q 2: NFSv2 server (stateless) without caching: open file + read 1024K in 128K chunks
    client steps:
        LOOKUP to get ID of file
            ~ b/c we need the ID of the file for future RPC calls (like READ)
            ~ b/c we need to tell the program whether or not the open fails (e.g. No such file or directory)
        READ 8 times (once for each chunk)
            ~ can't do less because we don't cache anything between reads the program does
    the ID
        ~~ usually something like (inode #)
        ~~ server doesn't need to know when we're done with the ID
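
    a minimal sketch of the RPC count above, in Python (not real NFS code: nfs_rpcs_for_read,
    the file name, and the file ID value 42 are made up for illustration). It assumes, as in the
    quiz, a client that caches nothing, so it sends one LOOKUP and then one READ per 128K chunk.
    Each READ names the ID and the offset itself, so the server has nothing to remember between requests.

```python
K = 1024

def nfs_rpcs_for_read(name, file_size, chunk_size, file_id=42):
    """List the RPCs a non-caching client sends to open and read a whole file.

    file_id stands in for whatever LOOKUP would return (hypothetical value).
    """
    rpcs = [("LOOKUP", name)]  # get the ID; also tells us whether the open fails
    offset = 0
    while offset < file_size:
        count = min(chunk_size, file_size - offset)
        # every READ carries the file ID and offset, so the server
        # keeps no per-client state between requests
        rpcs.append(("READ", file_id, offset, count))
        offset += count
    return rpcs

calls = nfs_rpcs_for_read("example.txt", 1024 * K, 128 * K)
reads = [c for c in calls if c[0] == "READ"]
print(len(calls), "RPCs total,", len(reads), "of them READs")
```

    for the quiz's numbers this gives 1 LOOKUP + 8 READs = 9 RPCs; there is no "close" RPC in the list,
    which matches the server not needing to know when we're done with the ID.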