- general format of questions
  - likely similar to quizzes, but with:
    - some cases where you might fill in a table
    - possibly some short written answers (<1 sentence)
  - unlike quizzes, probably will try to have questions that deal with topics from multiple parts of the semester
- information that will/will not be provided on the final
  - I don't expect you to memorize:
    - socket function names, argument format (but we might want you to know there's a function to open a connection, to create a socket)
    - P4 syntax
  - probably zero to few socket/P4 programming questions
    - for code, if it's obvious what function/arguments were intended, we don't care about syntax
  - if you need information about the format/size of headers, we will provide it
- congestion control calculations (see the window-update sketch after the routing notes below)
  - key idea in "steady state" (not in beginning-of-connection):
    additive increase PER RTT (1/RTT factor increase per packet)
    multiplicative decrease PER LOSS EVENT
    --> yields "fair" sharing between connections (if the additive amount is the same)
    intuition: increase slowly to avoid overload;
               decrease quickly to react to overload (b/c congestion collapse is so bad)
  - at the beginning of a connection, we want to rapidly discover the window size
    TCP strategy: "slow start" -- multiplicative increase PER RTT (constant factor increase per packet)
  - optimizations:
    - temporarily adjusting the window size for missing packets
      intuition: normally one packet ACK'd = one less packet in the network;
                 if we want to maintain N packets in the network, we should move the window edge and send a new packet
      BUT: on a loss without SACK, dup ACKs = one less packet in the network
           (but we normally can't advance the window b/c we'd need to change the window size)
           (adjustment after multiplicative decrease)
      want: window size = number of packets in the network
    - distinguishing between timeouts and "normal" losses detected with dup ACKs
      many versions of TCP reset to slow start on timeout
      - normally, if our window size is fairly large, most of the packets should get through and we should see 3 dup ACKs
      - this means that if the retransmit-on-dup-ACK heuristic doesn't work, then probably the network is very congested
      but on dup-ACK losses, just do multiplicative decrease
  - variations:
    - delay-based congestion control (look at the RTT changing instead of losses)
    - adjusting increase/decrease factors to not be additive or multiplicative (e.g. CUBIC)
    - using selective ACK or ECN information
- routing tables --- switch v router
  - "routing table": IP address range --> where to send (or other net address)
    in IP, "where to send" is: which output interface + "gateway"/next router (if any)
    example: router C connects 10.1/16, 10.2/16, and 10.3/16, and 10.3 connects to everything else
      10.1/16 | interface 0, no gateway [just use the local-network way of finding machines]
      10.2/16 | interface 1
      10.3/16 | interface 2
      default | interface 2, gateway 10.3.0.2 (another router)
    a gateway is needed if the ultimate destination isn't on the directly connected network
    (see the lookup sketch after this section)
  - the routing table is a network-layer thing [the layer that's about connecting networks together];
    it relies on the link layer understanding how to get frames to specific machines on the networks connected to the interfaces
  - router: acting at the IP (or similar) layer and connecting different networks together
    - USUALLY implemented with a RIP-like (distance vector) or OSPF-like (link state) protocol for big networks
      (for simple networks, we can statically configure; e.g. home routers)
  - switch: acting at the Ethernet/WiFi (or similar) layer and implementing the local network
    - typically the Ethernet/WiFi layer has a notion of "broadcast to all" that the IP layer doesn't
    - typically these work with hardware (MAC) addresses, not globally usable IP addresses
    - USUALLY implemented with MAC-learning-like ideas and a spanning-tree-like protocol for big local networks
  - we can use IP-like techniques with MAC addresses (and maybe some datacenters do this...),
    so aside from which types of addresses are used and how "big" the network tends to be,
    the switch/router distinction doesn't say that much about implementation techniques
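- sketch: TCP window updates (for the congestion control notes above)
  a minimal Python sketch, assuming made-up event hooks (on_ack, on_triple_dup_ack, on_timeout) and an
  arbitrary initial ssthresh; it shows the shape of slow start, additive increase, and multiplicative
  decrease, not the details of any particular real TCP (e.g. the temporary window inflation during
  fast recovery is omitted)

      # Illustrative AIMD + slow-start window bookkeeping (not a real TCP implementation).
      # cwnd is measured in packets; ssthresh separates slow start from steady state.
      class CongestionWindow:
          def __init__(self):
              self.cwnd = 1.0        # start small: we don't yet know a safe window size
              self.ssthresh = 64.0   # arbitrary initial threshold (assumption)

          def on_ack(self):
              if self.cwnd < self.ssthresh:
                  # slow start: +1 per ACK, so the window roughly doubles each RTT
                  # (multiplicative increase PER RTT)
                  self.cwnd += 1.0
              else:
                  # steady state: +1/cwnd per ACK, so roughly +1 per RTT
                  # (additive increase PER RTT, spread over the packets in the RTT)
                  self.cwnd += 1.0 / self.cwnd

          def on_triple_dup_ack(self):
              # "normal" loss detected via dup ACKs: multiplicative decrease PER LOSS EVENT
              self.ssthresh = max(self.cwnd / 2.0, 2.0)
              self.cwnd = self.ssthresh

          def on_timeout(self):
              # a timeout means the dup-ACK heuristic didn't work, i.e. the network is
              # probably very congested: many TCPs reset all the way back to slow start
              self.ssthresh = max(self.cwnd / 2.0, 2.0)
              self.cwnd = 1.0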
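- sketch: longest-prefix-match lookup (for the example routing table above)
  a small Python sketch using the standard ipaddress module; the table entries mirror the router C
  example, and the lookup() helper and the sample addresses are just for illustration

      import ipaddress

      # routing table for the example router C:
      # (prefix, output interface, gateway or None if the destination is on a directly connected network)
      TABLE = [
          (ipaddress.ip_network("10.1.0.0/16"), 0, None),
          (ipaddress.ip_network("10.2.0.0/16"), 1, None),
          (ipaddress.ip_network("10.3.0.0/16"), 2, None),
          (ipaddress.ip_network("0.0.0.0/0"),   2, ipaddress.ip_address("10.3.0.2")),  # default route
      ]

      def lookup(destination):
          """Return (interface, gateway) for the most specific (longest) matching prefix."""
          dest = ipaddress.ip_address(destination)
          matches = [(net, iface, gw) for net, iface, gw in TABLE if dest in net]
          net, iface, gw = max(matches, key=lambda entry: entry[0].prefixlen)
          return iface, gw

      print(lookup("10.2.34.5"))   # (1, None): on-link, so the link layer finds the machine directly
      print(lookup("192.0.2.77"))  # (2, IPv4Address('10.3.0.2')): not on-link, forward to the gateway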
- DV
  - the distance vector for a router X is a list of the networks (or machines) X can reach and its metric ("cost") to reach them
  - with DV-based routing, routers share distance vectors with their neighbors and use this information to update their own distance vectors
    example: if router X discovers neighbor Y can reach network A with cost 5, and router X has a cost-3 link to Y,
             then router X knows it can reach A with cost 8 via Y
    if we keep sharing distance vectors and making updates, we'll eventually have full information about what we can reach with the lowest metric,
    and we'll track which router to go to; then we can use this to create a routing table
    (see the update sketch after the RED notes below)
  - tricky part: updates that increase cost
    if router X thinks it can reach A with cost 8 via Y, and router Y says its distance to A is now 100,
    then router X needs to update its distance vector (and should look for alternative ways to get to A)
    "count to infinity" --- when something becomes unreachable, nodes won't instantly update their cost to infinity;
    they'll find a "phantom" alternate route, because not everyone has realized that the network is unreachable.
    Then we'll get some "loop" of paths that try to use this phantom route until the cost goes to infinity
    - mitigation: limit the maximum cost, or use a link-state protocol
  - versus link-state:
    - routers learn about the "whole" network, not just what their neighbors' distances are
- RED (random early detection)
  - we'd like to signal that a switch/router's queue is filling up
  - without something like ECN, the only way to send this signal is dropping a packet
  - RED solution: IMPLEMENTED on the ROUTER/SWITCH
    drop a packet with some probability when the queue is filling up, but not yet full
    goal: a TCP connection will see some packet loss and slow down before queues fill up and we have to drop a ton of packets
  - particularly useful because we want to have large queues to minimize "congestion collapse"-like problems
    (to have more leeway for congestion), but we don't want the norm to be the high latencies that large queues would give us
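- sketch: one distance-vector update at router X (for the DV notes above)
  a Python sketch of the "my cost = link cost to neighbor + neighbor's advertised cost" rule; router X,
  neighbor Y, the cost-3 link, and the cost-5 route to A come from the example above, while neighbor Z,
  network B, and the RIP-style cost cap of 16 are made-up for illustration

      INFINITY = 16  # cap on cost (the "limit the maximum cost" mitigation for count-to-infinity)

      # what X knows locally: the cost of its direct link to each neighbor
      link_cost = {"Y": 3, "Z": 9}

      # the distance vectors X's neighbors most recently advertised to it:
      # neighbor -> {destination network: that neighbor's cost to reach it}
      advertised = {
          "Y": {"A": 5, "B": 2},
          "Z": {"A": 20, "B": 1},
      }

      def recompute_distance_vector():
          """Recompute X's distance vector from scratch from its neighbors' advertisements.

          Recomputing from scratch (instead of only ever taking minimums) is what lets X
          react to an update that *increases* a neighbor's advertised cost.
          """
          vector = {}  # destination -> (cost, next-hop neighbor)
          for neighbor, their_vector in advertised.items():
              for dest, their_cost in their_vector.items():
                  cost = min(link_cost[neighbor] + their_cost, INFINITY)
                  if dest not in vector or cost < vector[dest][0]:
                      vector[dest] = (cost, neighbor)
          return vector

      print(recompute_distance_vector())
      # {'A': (8, 'Y'), 'B': (5, 'Y')}   (A reachable with cost 3 + 5 = 8 via Y, as in the example)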
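- sketch: the RED drop decision (for the RED notes above)
  a Python sketch of dropping with a probability that ramps up as the queue fills; the thresholds and
  maximum drop probability are made-up numbers, and real RED variants also track an exponentially
  weighted average queue length rather than looking at one instantaneous value

      import random

      MIN_THRESHOLD = 20    # below this queue length, never drop early (assumption)
      MAX_THRESHOLD = 80    # at or above this, treat the queue as effectively full (assumption)
      MAX_DROP_PROB = 0.10  # early-drop probability as we approach MAX_THRESHOLD (assumption)

      def red_should_drop(avg_queue_len):
          """Decide whether the router/switch should drop an arriving packet."""
          if avg_queue_len < MIN_THRESHOLD:
              return False  # queue is comfortably short: no early drops
          if avg_queue_len >= MAX_THRESHOLD:
              return True   # queue is (nearly) full: drop
          # in between: drop with a probability rising linearly from 0 to MAX_DROP_PROB,
          # so some TCP connection sees a loss and slows down before the queue fills
          fraction = (avg_queue_len - MIN_THRESHOLD) / (MAX_THRESHOLD - MIN_THRESHOLD)
          return random.random() < fraction * MAX_DROP_PROB

      print(red_should_drop(10))  # False
      print(red_should_drop(50))  # True about 5% of the time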
- encapsulation, types table on slide
  - encapsulation: carrying one network "connection" in another
  - encapsulation of X in Y, where X or Y can be (from high-level to low-level):
    - HTTP messages, DNS messages, ... (high-level)
    - TCP streams or series of UDP datagrams
    - IP packets
    - Ethernet packets (low-level)
  - why to choose one combination over another:
    - how easy is it to "capture" the network traffic
      e.g. for TCP streams, we can make a few small changes to code using sockets
      e.g. for IP packets, we can have our OS support a virtual IP device and edit the routing table w/o modifying programs at all
    - what manipulation of the traffic can we easily do
      e.g. for HTTP messages, it's much easier to add caching/compression/firewall-type filtering/etc.
      e.g. for IP/Ethernet, it's much easier to make the machine appear to be at a different IP to applications
    - overhead (see the arithmetic sketch at the end of these notes)
      e.g. for TCP streams or HTTP, we're implementing these higher layers "twice" (probably higher latency, but maybe an "optimization" opportunity)
      e.g. for IP/Ethernet in Y, we're sending multiple IP and/or Ethernet headers, when maybe choosing a different layer would avoid this
    - effect on maximum transmission unit size
    - ...
  - why encapsulate at all:
    - classic VPN scenario --- act as if on network A while really connected to network B
      example: network A = UVA, network B = some hotel somewhere
    - classic VLAN scenario --- pretend to have a separate local network for phones versus other devices
    - "tunnel" between two company sites so the company's internal network seems to span multiple office buildings with no direct connection
    - ISP selling a "virtual Ethernet" connection between two distant sites in a city
    - Tor --- privacy: hide the origin of (web) traffic
    - to make it so a firewall can process traffic that it's not on the "normal" path for
    - ...
- wireless --- RTS/CTS v carrier sense
  - request-to-send/clear-to-send (RTS/CTS) primarily deals with the problem of hidden nodes:
    we send short messages in BOTH directions, so it's likely any node that could interfere will see that the channel is reserved
  - pro:
    - we typically won't get interference from hidden nodes that think there's no transmission to interfere with
      (except if things are "exactly" badly timed, which would've been a problem either way)
  - con:
    - we have this extra exchange before sending data, which takes time we could have used to send some of the packet
  - probably good to use if:
    - our messages are larger (so the extra exchange is not much of the time; see the airtime sketch below)
    - we expect a lot of hidden-node problems
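- sketch: header overhead and MTU effect (for the encapsulation "overhead" bullet above)
  back-of-the-envelope Python arithmetic for one hypothetical combination, Ethernet frames carried in
  UDP over IPv4; the 1500-byte link MTU is an assumption and the header sizes are the standard minimums

      LINK_MTU = 1500        # bytes of IP packet the underlying link will carry (assumption)

      # standard minimum header sizes, in bytes
      IPV4_HEADER = 20
      UDP_HEADER = 8
      ETHERNET_HEADER = 14
      TCP_HEADER = 20

      # tunneling whole Ethernet frames inside UDP datagrams means the outer IP + UDP headers
      # plus the inner Ethernet header all come out of the same link MTU
      inner_ip_space = LINK_MTU - IPV4_HEADER - UDP_HEADER - ETHERNET_HEADER
      print("inner IP packet can be at most", inner_ip_space, "bytes")  # 1458

      tunneled_tcp_payload = inner_ip_space - IPV4_HEADER - TCP_HEADER
      plain_tcp_payload = LINK_MTU - IPV4_HEADER - TCP_HEADER
      print("TCP payload per packet, tunneled vs plain:",
            tunneled_tcp_payload, "vs", plain_tcp_payload)              # 1418 vs 1460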
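- sketch: RTS/CTS overhead vs frame size (for the wireless notes above)
  rough Python arithmetic with made-up round-number airtimes (not real 802.11 timings); the point is
  just that the fixed handshake cost matters much less when frames are larger

      RTS_CTS_EXCHANGE_US = 100.0  # time spent on the RTS + CTS control frames (assumption)

      def handshake_overhead(data_airtime_us):
          """Fraction of total airtime spent on the RTS/CTS exchange instead of data."""
          return RTS_CTS_EXCHANGE_US / (RTS_CTS_EXCHANGE_US + data_airtime_us)

      print(f"small frame  (200 us of data): {handshake_overhead(200):.0%} overhead")   # 33%
      print(f"large frame (2000 us of data): {handshake_overhead(2000):.0%} overhead")  # 5%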