Notes on the insanely detailed TCP implementation
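Each side of the connection has its own seq number and ack number. The ACK a host sends is the next byte it expects from the other side: the sender’s seq plus the number of payload bytes it received (a SYN or FIN counts as one, which is where seq+1 comes from during the handshake). Its own seq will be the ACK value the other side last sent, assuming nothing was lost.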
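To make the numbering concrete, here is a minimal sketch of how the ACK number follows from a received segment. The function name `ack_for` and the initial sequence numbers are made up for illustration; real stacks pick random initial sequence numbers.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits and wrap around


def ack_for(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Return the ACK number a receiver sends after getting this segment."""
    # SYN and FIN each consume one sequence number even though they carry no data.
    return (seq + payload_len + int(syn) + int(fin)) % SEQ_SPACE


# Three-way handshake with hypothetical initial sequence numbers.
client_isn, server_isn = 1000, 5000
syn_ack_number = ack_for(client_isn, 0, syn=True)    # server ACKs 1001
final_ack_number = ack_for(server_isn, 0, syn=True)  # client ACKs 5001

# After the handshake, the client sends 100 bytes starting at seq 1001.
data_ack_number = ack_for(1001, 100)                 # server ACKs 1101
```

So the "+1" only shows up for SYN and FIN; for ordinary data the ACK advances by the payload length.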
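You want to set the retransmission timeout to be longer than the round trip time, but how do you estimate the round trip time? TCP just keeps an exponentially weighted moving average of the measured RTT samples, plus a margin for how much those samples vary.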
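A rough sketch of that estimator, assuming the constants from RFC 6298 (alpha = 1/8, beta = 1/4, a margin of four times the variation, and a 1 second initial timeout). The class name and the `min_rto` parameter are my own, and clock granularity is ignored.

```python
class RttEstimator:
    """Smoothed RTT and retransmission timeout, in the style of RFC 6298."""

    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT variation
    K = 4           # how many "variations" of safety margin to add

    def __init__(self, min_rto: float = 1.0) -> None:
        self.srtt = None     # smoothed round trip time, seconds
        self.rttvar = None   # smoothed RTT variation, seconds
        self.min_rto = min_rto

    def update(self, sample: float) -> float:
        """Feed one RTT measurement (seconds) and return the new timeout."""
        if self.srtt is None:
            # First measurement: seed both estimates from it.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # Update the variation first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.rto()

    def rto(self) -> float:
        """Current retransmission timeout in seconds."""
        if self.srtt is None:
            return 1.0  # initial timeout before any sample exists
        return max(self.min_rto, self.srtt + self.K * self.rttvar)


est = RttEstimator(min_rto=0.0)  # floor disabled so the numbers are easy to follow
first = est.update(0.100)        # about 0.30 s: 0.1 + 4 * 0.05
second = est.update(0.200)       # about 0.36 s: the timeout grows after a slower sample
```

The margin term is what keeps the timeout comfortably above the average when RTTs are jumpy.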
If the sender receives three duplicate ACKs (three is a design choice) that all ACK up to the same point, fast retransmit means that the sender doesn’t wait for the timeout and just resends the data after that point right away, because three duplicate ACKs very likely mean that a segment got lost rather than merely reordered.
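The duplicate-ACK counting can be sketched like this. It is a simplification: the function name is hypothetical and it only looks at a list of ACK numbers, not at real segments.

```python
DUP_ACK_THRESHOLD = 3  # the "three" in fast retransmit


def fast_retransmit_points(acks):
    """Return the ACK numbers at which fast retransmit would fire."""
    last_ack = None
    dup_count = 0
    fired = []
    for ack in acks:
        if ack == last_ack:
            # Same ACK again: the receiver is still waiting for the byte at `ack`.
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                fired.append(ack)  # resend the segment starting at `ack`
        else:
            # New data was acknowledged, so start counting from scratch.
            last_ack = ack
            dup_count = 0
    return fired


# The segment starting at byte 300 is lost; later segments trigger repeated ACKs for 300.
lost_at = fast_retransmit_points([100, 200, 300, 300, 300, 300, 400])  # [300]
```

Note that the first ACK for 300 is not a duplicate, so the retransmit fires on the fourth identical ACK.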
Every ACK packet from the receiver is going to have a “receiver window size”, and that tells the sender how much un-ACKed data it can have on the wire at a given time. Otherwise the sender could deliver data faster than the receiving application reads it and overflow the receiver’s buffer.
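In code, the sender-side check might look roughly like this. The variable names are my own, and this ignores the congestion window entirely.

```python
def usable_window(last_byte_sent: int, last_byte_acked: int, rwnd: int) -> int:
    """How many more bytes the sender may put on the wire right now."""
    in_flight = last_byte_sent - last_byte_acked  # sent but not yet ACKed
    # Never go negative: the receiver may shrink its window below what is in flight.
    return max(0, rwnd - in_flight)


# Receiver advertises 4096 bytes; 2000 bytes are already in flight.
room = usable_window(last_byte_sent=5000, last_byte_acked=3000, rwnd=4096)  # 2096
```

When this hits zero the sender has to wait for an ACK (or a window update) before sending more.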
Just note that once you saturate a link, you’ll get some natural loss, and that loss means you need to retransmit. Retransmissions add load to the link, which causes even more loss. So even if the useful data you’re sending is only half of the link’s capacity, retransmissions can push the actual load well above half, performance drops, and you end up in a bad cycle.
Paradigms of congestion control: end-to-end (smart end systems just infer congestion from losses) and hop-by-hop, also called network-assisted (routers can set a bit in a packet’s header to indicate congestion).
To get a sense for how much the network can take, TCP does a slow-start: it starts with a small congestion window (a small number of packets allowed unACKed) and waits for the ACKs for those packets. Once it gets all the ACKs (and thus doesn’t detect losses) it doubles the window size, so the window grows exponentially per round trip. Once it sees losses, it cuts back, because it assumes those losses are caused by network congestion.
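A toy model of that doubling, assuming a made-up `capacity` (the largest window the path carries without loss) and counting in whole packets per round trip:

```python
def slow_start_windows(initial_cwnd: int, capacity: int, max_rounds: int = 32):
    """Window size used in each round trip, up to and including the first lossy round."""
    cwnd = initial_cwnd
    history = []
    for _ in range(max_rounds):
        history.append(cwnd)
        if cwnd > capacity:
            break      # this round overshoots the path, so the sender sees loss
        cwnd *= 2      # every packet was ACKed: double for the next round
    return history


rounds = slow_start_windows(initial_cwnd=1, capacity=20)  # [1, 2, 4, 8, 16, 32]
```

Despite the name, it only takes a handful of round trips to overshoot the path.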
AIMD (additive increase, multiplicative decrease): Slow-start until you either have loss or you hit some predefined threshold. Once you hit that threshold, increase the window by about one segment per round trip, which works out to roughly 1/window per ACK (so you’ll go over the threshold linearly; adding a full segment for every ACK would still be doubling). Once you hit a loss, reset the congestion window size to 1 (I guess to give the network some time to cool down?) and set the threshold to half whatever window size you had when the loss occurred. Then slow-start up to that new threshold. The threshold at any time essentially represents what we should expect to be okay utilization (so we can slow-start to it), and anything above it that we probe linearly is just making sure that we’re not underutilizing our link. Fast recovery, added in a newer version of TCP (TCP Reno), skips slow start after a loss detected by three duplicate ACKs and just does linear probing from the new threshold; a timeout still drops the window back to 1.
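Here is a per-round-trip toy simulation of the two behaviours. Big assumptions: the window is counted in whole segments, loss happens exactly when the window exceeds a made-up `capacity`, and every loss is treated as detected by duplicate ACKs (so the fast-recovery variant never falls back to 1).

```python
def simulate(rounds: int, capacity: int, ssthresh: int, fast_recovery: bool = False):
    """Return the congestion window used in each round trip."""
    cwnd = 1
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        if cwnd > capacity:
            # Loss: multiplicative decrease of the threshold.
            ssthresh = max(cwnd // 2, 2)
            # Without fast recovery restart from 1; with it, resume at the threshold.
            cwnd = ssthresh if fast_recovery else 1
        elif cwnd < ssthresh:
            # Slow start: double, but do not jump past the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: additive increase of one segment per round trip.
            cwnd += 1
    return history


no_fast_recovery = simulate(12, capacity=10, ssthresh=8)
# [1, 2, 4, 8, 9, 10, 11, 1, 2, 4, 5, 6]
with_fast_recovery = simulate(12, capacity=10, ssthresh=8, fast_recovery=True)
# [1, 2, 4, 8, 9, 10, 11, 5, 6, 7, 8, 9]
```

Both runs produce the familiar sawtooth; the fast-recovery one just avoids the dip all the way back to 1.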
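Congestion avoidance: this is the linear probing above the threshold. Why is it called that? Presumably because past the threshold the sender is near the rate where it last saw loss, so it grows cautiously to avoid causing congestion instead of only reacting after the fact.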
Because window sizes limit how much data can be in flight, the sender can end up stalled waiting for ACKs while the link sits idle, which means the link isn’t fully utilized. Throughput is capped at roughly one window per round trip.
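A quick back-of-the-envelope check of that limit. The function names and the example numbers (a 65535-byte window, a 100 ms round trip, a 100 Mbit/s link) are just illustrative.

```python
def max_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput: at most one full window per round trip."""
    return window_bytes * 8 / rtt_s


def bandwidth_delay_product_bytes(link_bps: float, rtt_s: float) -> float:
    """Window needed to keep the link busy for a whole round trip."""
    return link_bps * rtt_s / 8


cap = max_throughput_bps(65535, 0.100)              # about 5.2 Mbit/s
needed = bandwidth_delay_product_bytes(100e6, 0.100)  # about 1.25 MB
```

So with these numbers the sender would use only around 5% of the link, no matter how fast the link is, until the window gets bigger.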