cs398 Lecture Notes
Spring 2000
Week 11, Tuesday

For today you should have read Sections 5.1 and 5.2

For Thursday you should read Section 5.3 lightly
and Section 5.4 in detail.

Also, you should write answers to questions 1, 8, 9 and 11


End-to-end protocols

1) Transport protocols, UDP and TCP

2) RPC - remote procedure call, request/reply

   (What are some examples of end-to-end services?)

   (I think this chapter is guilty of some bluster)


UDP
---

User Datagram Protocol (user is responsible for everything)

Connectionless -- no setup, no breakdown, just send packets

Unreliable -- if it doesn't get there you might get an ICMP
	      error, but often you hear nothing; either way it is
	      up to you (the application) to resend if you want to

Process to process -- you are not talking to the recipient
		      host; you are talking to a particular
		      process on that host.

How to identify the recipient process?

1) process ID:   why not?

2) port number:  these are abstract, arbitrary numbers

Any process on a host can "claim" any port number.

Certain numbers are supposed to be attached to specific applications.

These are "well-known" port numbers.

If an application has a well-known number, then clients
from other machines know where to find it without having
to ask.

Most common UNIX applications have well-known numbers, but
the mechanism for creating new ones is bad.

One alternative is to run an application that provides
a lookup service from names (strings) to port numbers.

  RESOURCE BROKER

Application chooses an arbitrary, available port, and notifies
the local RB.

Remote applications connect to RB to get the port number, then
connect directly to requested service.
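
A minimal sketch of the broker idea, assuming a plain in-process dictionary
stands in for the RB (a real one would be a separate server listening on a
well-known port). The names ResourceBroker, register, lookup and "echo" are
made up for illustration. The service binds to port 0, which asks the OS for
any available port, registers that port under a name, and the client looks
the name up before sending a datagram directly to the service.

```python
import socket


class ResourceBroker:
    """Hypothetical name -> port table standing in for the RB."""

    def __init__(self):
        self._ports = {}

    def register(self, name, port):
        self._ports[name] = port

    def lookup(self, name):
        return self._ports[name]


def demo():
    broker = ResourceBroker()

    # Service side: port 0 means "give me any available port".
    service = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    service.bind(("127.0.0.1", 0))
    service.settimeout(2.0)
    port = service.getsockname()[1]
    broker.register("echo", port)

    # Client side: ask the broker, then talk to the service directly.
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.sendto(b"hello", ("127.0.0.1", broker.lookup("echo")))

    data, _addr = service.recvfrom(1024)
    client.close()
    service.close()
    return port, data
```

Only the broker itself needs a well-known port; everything else can be
found by name.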

All of this applies to TCP, too, which uses the same port
space.

	(didn't have to be that way, right?)

(How can you tell that an incoming UDP packet is a UDP packet?)


TCP
---

Transmission Control Protocol

What's being "controlled"?

1) flow control (don't overrun the buffer capacity at the
   receiver)

   (What's congestion control?)

2) reliability --

   transparent resend, reordering, duplicate elimination

3) arbitrary-length messages (actually, a byte stream)


What's different in this environment from point-to-point links?

1) not always talking to the same computer; need explicit
   setup phase

   (really?)

2) large variability in RTT, over connections and time

   (so what?)

3) delay-bandwidth product varies a lot from connection to
   connection and time to time

   (so what?)

4) reordering is possible (remember that this was a weirdness
   when we talked about SWS in Chapter 2)

   (need to deal with the possibility of long-lost packets
   wandering in from the desert)

5) in p-to-p, you can tell when you are swamping the link

   over an internet, you might be killing something in the
   middle without realizing it

   (congestion control -- Chapter 6)


Why not do sliding window at hop-to-hop level, rather than
end-to-end?

pro: now flow control and congestion control are the same
thing

con: in heterogeneous network, you don't want to depend on
every router and every link being correct

If you can't depend on hop-by-hop, better do end-to-end,
unless there is a compelling performance advantage.

       (for example?)


Segmentation
------------

TCP is a byte-stream protocol.

Sender writes bytes in arbitrary chunks.

Receiver reads in arbitrary chunks.

Bytes are buffered at sender until transmitted.

Buffered at receiver until read by the application.

If reader empties the buffer, reader blocks until more
data arrive.

Data can be transmitted in segments that have nothing
to do with the chunks that are written by the sender or
read by the receiver.
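
A small sketch of the "chunks don't matter" point. It assumes
socket.socketpair(), which on most systems gives a pair of connected stream
sockets that are not TCP but have the same byte-stream semantics. The writer
sends three chunks; the reader pulls at most 4 bytes at a time. The chunk
sizes and the function name are arbitrary.

```python
import socket


def demo_byte_stream():
    writer, reader = socket.socketpair()
    reader.settimeout(2.0)

    # Sender writes three chunks of different sizes.
    for chunk in (b"abc", b"defgh", b"ij"):
        writer.sendall(chunk)
    writer.close()  # signals end of stream to the reader

    # Receiver reads at most 4 bytes per call until end of stream.
    reads = []
    while True:
        piece = reader.recv(4)
        if not piece:
            break
        reads.append(piece)
    reader.close()
    return reads
```

The pieces that come back need not line up with the pieces that were
written; only the concatenated bytes are guaranteed to match.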

When are segments sent?

1) when the sender accumulates one MSS worth (maximum segment
   size: roughly the MTU minus the TCP and IP headers)

2) when the sender pushes (flushes)

3) periodically


Segment headers
---------------

Ports just like UDP

Sequence number (32 bit)

Acknowledgement field allows us to piggyback ACKs along with
return data!

Acknowledgement, SequenceNum and AdvertisedWindow used for flow control

Additional flags

Checksum (just like UDP)
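
To make the field list concrete, here is a sketch that packs and unpacks
the fixed 20-byte part of a TCP header with Python's struct module. The
tuple field names are my own, options are ignored, and the checksum is
carried as a plain number rather than computed.

```python
import struct
from collections import namedtuple

# Fixed 20-byte part of the TCP header, in network byte order:
# ports (16 bits each), sequence and acknowledgement numbers (32 bits
# each), header length + flags (16 bits), window, checksum, urgent ptr.
TCP_HEADER = struct.Struct("!HHIIHHHH")

TcpHeader = namedtuple(
    "TcpHeader",
    "src_port dst_port seq ack hdr_len flags window checksum urg_ptr",
)

SYN, ACK = 0x02, 0x10  # two of the flag bits


def pack_header(h):
    # Header length (in 32-bit words) occupies the top 4 bits of the
    # 16-bit word whose low 6 bits carry the flags.
    offset_flags = (h.hdr_len << 12) | (h.flags & 0x3F)
    return TCP_HEADER.pack(h.src_port, h.dst_port, h.seq, h.ack,
                           offset_flags, h.window, h.checksum, h.urg_ptr)


def unpack_header(data):
    (src, dst, seq, ack, offset_flags,
     window, checksum, urg) = TCP_HEADER.unpack(data[:TCP_HEADER.size])
    return TcpHeader(src, dst, seq, ack, offset_flags >> 12,
                     offset_flags & 0x3F, window, checksum, urg)
```

Note that the window field is only 16 bits and the sequence number only
32; both limits come back later in these notes.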


Connection
----------

Passive open: socket listen  (anyone)

Active open: socket connect  (name other party)
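
A sketch of both kinds of open using Python's socket module over loopback.
The function name is made up. It works in a single thread because the
kernel completes the handshake for a listening socket before accept() is
called.

```python
import socket


def demo_open():
    # Passive open: bind and listen, willing to talk to anyone.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS choose
    listener.listen(1)
    listener.settimeout(2.0)

    # Active open: connect() names the other party (host, port).
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    client.connect(listener.getsockname())

    # accept() hands back a new socket plus the client's address.
    conn, peer = listener.accept()
    local = client.getsockname()
    for s in (conn, client, listener):
        s.close()
    return peer, local
```

The address the listener sees for its peer should be the same (host, port)
pair the client reports as its own end of the connection.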

Three way handshake

1) SYN (starting sequence number)

2) SYN (starting sequence number), ACK (next expected number)

3) ACK (next expected number)

   Why not start with sequence number 0?
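
The numbers exchanged in the three steps can be sketched as follows. The
tuple layout (flags, seq, ack) and the function name are made up for
illustration; real segments carry much more. Each SYN consumes one
sequence number, so the "next expected number" is the peer's starting
number plus one, modulo 2^32.

```python
SEQ_MOD = 2 ** 32  # sequence numbers are 32 bits and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three segments as (flags, seq, ack) tuples."""
    # 1) client -> server: SYN with the client's starting number
    syn = ("SYN", client_isn, None)
    # 2) server -> client: its own SYN, plus an ACK of the client's
    syn_ack = ("SYN+ACK", server_isn, (client_isn + 1) % SEQ_MOD)
    # 3) client -> server: ACK of the server's starting number
    ack = ("ACK", (client_isn + 1) % SEQ_MOD, (server_isn + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]
```

For example, three_way_handshake(100, 300) ends with the client
acknowledging 301 and the server having acknowledged 101.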

State transition diagrams are an important tool for software
design.  Study this one carefully.  Work through the examples.

See how many bugs you would not have thought of if you didn't
use this diagram.

    (paragraph on the top of page 383)


Sliding Window
--------------

Sender: last byte acked, last byte sent, last byte written

Receiver: last byte read, next byte expected, last byte received

Receiver throttles the sender by advertising a window that is
no larger than the amount of data it has buffer space for.

Sender is allowed to send 

       Advertised window - unacknowledged bytes in flight
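
As arithmetic, using the sender's pointers from above. This is a sketch
that ignores sequence-number wraparound; the function and parameter names
are my own.

```python
def effective_window(advertised_window, last_byte_sent, last_byte_acked):
    """Bytes the sender may still put on the wire right now."""
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet acked
    return max(0, advertised_window - in_flight)
```

With a 4096-byte advertised window and 2000 bytes in flight, the sender
may send 2096 more; with an advertised window of 0 it may send nothing.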

When receiver advertises 0 window length, sender must stop.

But only ACKs contain window advertisements.

    (So how do we get started again?)

    (What is the underlying design principle here?)

    (Can anyone think of a reason for it?)


Wraparound
----------

Have to make sure we don't send 4 billion bytes (the whole
32-bit sequence space; sequence numbers count bytes) in less
than 120 seconds

     (what's 120 seconds?)

OC12 can easily do this.  Ouch!

     (Are they neglecting headers?)
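
A back-of-the-envelope check, assuming nominal link rates (T3 = 45 Mbps,
OC-12 = 622 Mbps) and treating every bit on the link as TCP payload, which
is an assumption, not how real links behave. The function name is my own.

```python
SEQ_SPACE_BYTES = 2 ** 32  # one sequence number per byte


def wrap_time_seconds(bandwidth_bps):
    """Time to send the whole sequence space at the given bit rate."""
    bytes_per_second = bandwidth_bps / 8
    return SEQ_SPACE_BYTES / bytes_per_second


t3 = wrap_time_seconds(45e6)      # roughly 13 minutes
oc12 = wrap_time_seconds(622e6)   # roughly 55 seconds
```

The T3 is comfortably above 120 seconds; the OC-12 is not.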

Anyway, we can fix the problem by using the TCP timestamp option
as additional sequence number high-order bits.

One of the problems for next time addresses this.


Advertised window
-----------------

16-bit advertised window only allows 64 KB in flight.

Can't keep even moderate pipes full!

Transcontinental T3 has delay-bandwidth = 549 KB
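
Where the 549 KB comes from, assuming a nominal 45 Mbps T3, a 100 ms
round-trip time, and KB = 1024 bytes (all three are assumptions on my
part, chosen because they reproduce the figure). Function names are made
up.

```python
MAX_WINDOW_BYTES = 2 ** 16 - 1  # largest value a 16-bit field can hold


def delay_bandwidth_bytes(bandwidth_bps, rtt_seconds):
    """Bytes that fit in the pipe during one round trip."""
    return bandwidth_bps * rtt_seconds / 8


def pipe_utilization(bandwidth_bps, rtt_seconds):
    """Fraction of the pipe a single 16-bit window can keep full."""
    pipe = delay_bandwidth_bytes(bandwidth_bps, rtt_seconds)
    return min(1.0, MAX_WINDOW_BYTES / pipe)


t3_pipe_kb = delay_bandwidth_bytes(45e6, 0.100) / 1024  # about 549
t3_fill = pipe_utilization(45e6, 0.100)                 # about 0.12
```

So one connection with a full 64 KB window keeps only about 12% of this
pipe busy.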

   (What's the solution?)

   (What else have we seen that looks like this?)


Adaptive retransmission
-----------------------

Why is it important to get this right?

1) Delays indicate contention.  If you retransmit whenever there
   is a delay, you are contributing unnecessary additional load
   at the worst time.

2) Timeouts are used for congestion control (Chapter 6).
   Important not to overreact.


How to choose retransmit time adaptively

1) running average of RTT

   newAvg = alpha * oldAvg + (1 - alpha) * sampleRTT

   This is a standard technique for running averages.

   alpha -> 0  means we use only the most recent sample
   alpha -> 1  means we ignore new data

   Something like 0.825 works well.  Why?

   Timeout = 2 * newAvg

   (What's the factor of 2 for?)

2) Karn/Partridge

   Apparently you can get your name on something just by
   suggesting that when you have to do a retransmit, you
   don't count it as a sample.

3) Jacobson/Karels

   Deviation matters!

   If all measured RTTs are the same, then there is no
   reason for the arbitrary factor of 2.

   If the deviation is high, you might want to wait extra long.

   Estimate cumulative deviation along with cumulative average.

   (There are lots of hackish ways to do this approximately)

   timeout = average + phi * deviation

   phi is something like 4
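
The three ideas above fit in one small class. This is a sketch, not any
real TCP's code: the class and method names are mine, the deviation is
smoothed with the same alpha as the average (one of the approximate ways
to do it), and the defaults alpha = 0.825 and phi = 4 are just the values
mentioned in these notes.

```python
class RttEstimator:
    def __init__(self, first_sample, alpha=0.825, phi=4.0):
        self.alpha = alpha
        self.phi = phi
        self.average = first_sample
        self.deviation = 0.0

    def update(self, sample_rtt, retransmitted=False):
        # Karn/Partridge: a sample from a retransmitted segment is
        # ambiguous (which copy was acked?), so don't count it.
        if retransmitted:
            return
        error = sample_rtt - self.average
        # 1) running average of RTT
        self.average = (self.alpha * self.average
                        + (1 - self.alpha) * sample_rtt)
        # 3) running average of how far samples stray from the average
        self.deviation = (self.alpha * self.deviation
                          + (1 - self.alpha) * abs(error))

    def simple_timeout(self):
        return 2 * self.average

    def timeout(self):
        # Jacobson/Karels: pad the average by a multiple of the deviation.
        return self.average + self.phi * self.deviation
```

With perfectly steady samples the deviation stays near zero and timeout()
sits right at the average; simple_timeout() would still wait twice as
long.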


Record boundaries
-----------------

Apparently TCP supports some ways of telling the receiver
where the breaks are in the data, but I don't think anyone
uses them because they are not provided by the API, at least
not in a very friendly way.

Besides, it's easy to put this info into the data stream
itself.

The next message is 25 bytes long.
Here is the next message.
The next message is 8 bytes long.
Bite me.
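
The same idea in code: a sketch of length-prefixed framing, assuming a
4-byte big-endian length in front of each message (the prefix size and the
function names are my choice, not anything TCP defines).

```python
import struct

LENGTH_PREFIX = struct.Struct("!I")  # 4-byte big-endian length


def frame(message):
    """Prepend the message length so the receiver can find the end."""
    return LENGTH_PREFIX.pack(len(message)) + message


def deframe(stream):
    """Split bytes into complete messages plus any leftover bytes."""
    messages = []
    pos = 0
    while len(stream) - pos >= LENGTH_PREFIX.size:
        (length,) = LENGTH_PREFIX.unpack_from(stream, pos)
        start = pos + LENGTH_PREFIX.size
        if len(stream) - start < length:
            break  # rest of this message has not arrived yet
        messages.append(stream[start:start + length])
        pos = start + length
    return messages, stream[pos:]
```

The receiver keeps the leftover bytes and calls deframe again once more
data has arrived, so it does not matter how TCP happened to segment the
stream.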


Taxonomy of end-to-end protocols
--------------------------------

byte-stream vs. request-reply

reliable vs. unreliable

What are the cons of byte-stream?

1) setup-breakdown overhead  (matters more for short transactions)

   this topic will be back when we talk about HTTP

2) no record boundaries

   (but we just said that it's easy to do this at the application
   level -- in the proxy server you wrote, you used newlines to
   demarcate "lines")

What are the advantages?

1) no upper bound on message size

   (do we buy this? -- see page 396)

2) setup gives the receiver a chance to reject before the
   sender sends data

   (do you get the sense this section is really desperate?)

Rate control versus window control -- more foreshadowing of Chapter 6.