cs398 Lecture Notes
Spring 2000
Week 11, Tuesday

For today you should have read Sections 5.1 and 5.2.

For Thursday you should read Section 5.3 lightly and 5.4 in detail.
Also, you should write answers to questions 1, 8, 9 and 11.

End-to-end protocols

1) Transport protocols, UDP and TCP
2) RPC - remote procedure call, request/reply

(What are some examples of end-to-end services?)
(I think this chapter is guilty of some bluster)

UDP --- User Datagram Protocol
(user is responsible for everything)

Connectionless -- no setup, no breakdown, just send packets

Unreliable -- if it doesn't get there you (probably) get an ICMP
message, but it is up to you (the application) to resend if you
want to

Process to process -- you are not talking to the recipient host; you
are talking to a particular process on that host.

How to identify the recipient process?

1) process ID: why not?
2) port number: these are abstract, arbitrary numbers

Any process on a host can "claim" any port number.

Certain numbers are supposed to be attached to specific applications.
These are "well-known" port numbers.

If an application has a well-known number, then clients from other
machines know where to find it without having to ask.

Most common UNIX applications have well-known numbers, but the
mechanism for creating new ones is bad.

One alternative is to run an application that provides a lookup
service from names (strings) to port numbers.

RESOURCE BROKER

An application chooses an arbitrary available port and notifies the
local RB. Remote applications connect to the RB to get the port
number, then connect directly to the requested service.

All of this applies to TCP, too, which uses the same port space.
(didn't have to be that way, right?)

(How can you tell that an incoming UDP packet is a UDP packet?)

TCP --- Transmission Control Protocol

What's being "controlled"?

1) flow control (don't overrun the buffer capacity at the receiver)
   (What's congestion control?)
2) reliability -- transparent resend, reordering, duplicate elimination
3) arbitrary-length messages (actually, a byte stream)

What's different in this environment from point-to-point links?

1) not always talking to the same computer; need explicit setup phase
   (really?)
2) large variability in RTT, over connections and time
   (so what?)
3) delay-bandwidth product varies a lot from connection to connection
   and time to time
   (so what?)
4) reordering is possible (remember that this was a weirdness when
   we talked about SWS in Chapter 2)
   (need to deal with the possibility of long-lost packets wandering
   in from the desert)
5) in p-to-p, you can tell when you are swamping the link;
   over an internet, you might be killing something in the middle
   without realizing it (congestion control -- Chapter 6)

Why not do sliding window at hop-to-hop level, rather than end-to-end?

pro: now flow control and congestion control are the same thing
con: in a heterogeneous network, you don't want to depend on every
     router and every link being correct

If you can't depend on hop-by-hop, better do end-to-end, unless there
is a compelling performance advantage. (for example?)

Segmentation
------------

TCP is a byte-stream protocol. Sender writes bytes in arbitrary
chunks. Receiver reads in arbitrary chunks.

Bytes are buffered at the sender until transmitted, and buffered at
the receiver until read by the application. If the reader empties the
buffer, the reader blocks until more data arrive.

Data can be transmitted in segments that have nothing to do with the
chunks that are written by the sender or read by the receiver.

When are segments sent?

1) when the sender accumulates one MTU worth
2) when the sender pushes (flushes)
3) periodically

Segment headers
---------------

Ports just like UDP
Sequence number (32 bit)
Acknowledgement field allows us to piggyback ACKs along with return data!
Acknowledgement, SequenceNum and AdvertisedWindow used for flow control
Additional flags
Checksum (just like UDP)

Connection
----------

Passive open: socket listen (anyone)
Active open: socket connect (name other party)

Three-way handshake

1) SYN (starting sequence number)
2) SYN (starting sequence number), ACK (next expected number)
3) ACK (next expected number)

Why not start with sequence number 0?

State transition diagrams are an important tool for software design.
Study this one carefully. Work through the examples. See how many
bugs you would not have thought of if you didn't use this diagram.
(paragraph on the top of page 383)

Sliding Window
--------------

Sender: last byte acked, last byte sent, last byte written
Receiver: last byte read, next byte expected, last byte received

Receiver throttles the sender by advertising a window that is no
larger than the amount of data it has buffer space for.

Sender is allowed to send

    Advertised window - unacknowledged bytes in flight

When the receiver advertises a 0 window length, the sender must stop.
But only ACKs contain window advertisements.
(So how do we get started again?)
(What is the underlying design principle here?)
(Can anyone think of a reason for it?)

Wraparound
----------

Sequence numbers count bytes, so we have to make sure we don't send
4 billion bytes in less than 120 seconds (what's 120 seconds?)

OC12 can easily do this. Ouch! (Are they neglecting headers?)

Anyway, we can fix the problem by using the timestamp field as
additional sequence number high-order bits. One of the problems for
next time addresses this.

Advertised window
-----------------

16-bit advertised window only allows 64 KB in flight.
Can't keep even moderate pipes full!
Transcontinental T3 has delay-bandwidth = 549 KB
(What's the solution?)
(What else have we seen that looks like this?)

Adaptive retransmission
-----------------------

Why is it important to get this right?

1) Delays indicate contention.
   If you retransmit whenever there is a delay, you are contributing
   unnecessary additional load at the worst time.
2) Timeouts are used for congestion control (Chapter 6).
   Important not to overreact.

How to choose the retransmit timeout adaptively

1) running average of RTT

       newAvg = alpha * oldAvg + (1 - alpha) * sampleRTT

   This is a standard technique for running averages.
   alpha -> 0 means we use only the most recent sample
   alpha -> 1 means we ignore new data
   Something like 0.825 works well. Why?

       Timeout = 2 * newAvg

   (What's the factor of 2 for?)

2) Karn/Partridge

   Apparently you can get your name on something just by suggesting
   that when you have to do a retransmit, you don't count it as a
   sample.

3) Jacobson/Karels

   Deviation matters! If all measured RTTs are the same, then there
   is no reason for the arbitrary factor of 2. If the deviation is
   high, you might want to wait extra long.

   Estimate cumulative deviation along with cumulative average.
   (There are lots of hackish ways to do this approximately)

       timeout = average + phi * deviation

   phi is something like 4

Record boundaries
-----------------

Apparently TCP supports some ways of telling the receiver where the
breaks are in the data, but I don't think anyone uses them because
they are not provided by the API, at least not in a very friendly way.

Besides, it's easy to put this info into the data stream itself.

    The next message is 25 bytes long.
    Here is the next message.
    The next message is 8 bytes long.
    Bite me.

Taxonomy of end-to-end protocols
--------------------------------

byte-stream vs. request-reply
reliable vs. unreliable

What are the cons of byte-stream?

1) setup-breakdown overhead (matters more for short transactions)
   this topic will be back when we talk about HTTP
2) no record boundaries (but we just said that it's easy to do this
   at the application level -- in the proxy server you wrote, you
   used newlines to demarcate "lines")

What are the advantages?

1) no upper bound on message size
   (do we buy this?
   -- see page 396)
2) setup gives the receiver a chance to reject before the sender
   sends data
   (do you get the sense this section is really desperate?)

Rate control versus window control -- more foreshadowing of Chapter 6.
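
A sketch of the "put the record boundaries into the data stream" idea from the Record boundaries section: each message is preceded by its length. The notes do not say how the length is encoded, so the 4-byte big-endian prefix and the names frame and unframe are assumptions made up for this example. It also assumes the whole stream has already arrived; a real reader would have to cope with a prefix or a message split across reads.

```python
import struct

# Assumption for this sketch: lengths are sent as 4-byte big-endian
# unsigned integers. The notes do not specify an encoding.
PREFIX = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Return the message preceded by its length."""
    return PREFIX.pack(len(message)) + message


def unframe(stream: bytes) -> list:
    """Split a complete byte stream back into the framed messages."""
    messages = []
    offset = 0
    while offset < len(stream):
        # Read the length prefix, then take exactly that many bytes.
        (length,) = PREFIX.unpack_from(stream, offset)
        offset += PREFIX.size
        messages.append(stream[offset:offset + length])
        offset += length
    return messages


# The two messages from the notes, written into one byte stream.
stream = frame(b"Here is the next message.") + frame(b"Bite me.")
recovered = unframe(stream)
```

The receiver can read the stream in whatever chunks TCP delivers; the boundaries come from the prefixes, not from how the segments happened to be cut.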
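
A sketch of the three retransmit-timeout ideas from the Adaptive retransmission section, in one small class. The running average, the factor of 2, and timeout = average + phi * deviation come from the notes. The deviation update (a running average of |sample - average| with weight delta = 0.125) is only one of the "hackish ways" and is an assumption here, as are the class and method names and the choice to start the deviation at 0.

```python
class RttEstimator:
    """Running RTT average and deviation (illustrative sketch only)."""

    def __init__(self, first_sample, alpha=0.825, delta=0.125, phi=4.0):
        self.alpha = alpha      # weight on the old average (from the notes)
        self.delta = delta      # weight on new deviation samples (assumed)
        self.phi = phi          # deviation multiplier (from the notes)
        self.average = first_sample
        self.deviation = 0.0    # assumed starting value

    def update(self, sample_rtt, retransmitted=False):
        # Karn/Partridge: a retransmitted segment is not counted as a sample.
        if retransmitted:
            return
        error = abs(sample_rtt - self.average)
        self.deviation = (1 - self.delta) * self.deviation + self.delta * error
        # newAvg = alpha * oldAvg + (1 - alpha) * sampleRTT
        self.average = self.alpha * self.average + (1 - self.alpha) * sample_rtt

    def simple_timeout(self):
        # Original scheme: a fixed factor of 2 on the average.
        return 2 * self.average

    def timeout(self):
        # Jacobson/Karels: average plus a multiple of the deviation.
        return self.average + self.phi * self.deviation


est = RttEstimator(first_sample=100.0)   # times in milliseconds
for sample in [100.0, 100.0, 100.0]:
    est.update(sample)
steady_timeout = est.timeout()           # no deviation, so close to the average

est.update(5000.0, retransmitted=True)   # ignored
est.update(300.0)                        # one slow sample
jittery_timeout = est.timeout()          # grows with the deviation
```

With identical samples the deviation stays at zero and the timeout sits at the average, which is the point of the "no reason for the arbitrary factor of 2" remark; a single slow sample pushes the timeout well above the new average.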