Reliable Delivery & Sequence Numbers

TCP gives us reliable, in-order delivery with no duplicates on top of an unreliable internet, for as long as the connection stays up. The magic ingredients are sequence numbers, acknowledgements, and retransmissions.

In simple language: every byte gets numbered. The receiver tells the sender “I got up to byte X.” If something goes missing, the sender resends.

The Core Mechanism

  1. Sender numbers each byte with a sequence number.
  2. Receiver sends back an acknowledgement (ACK) naming the next byte it expects.
  3. Sender keeps a timer for unacked data. If the timer fires, retransmit.
  4. Receiver buffers out-of-order data and reorders before delivering to the app.

That’s it. Every other reliability feature is built on these four ideas.

Sequence Numbers Are Per-Byte

Each byte in a TCP stream has its own sequence number. The header carries the seq# of the first byte in the segment.

Segment 1:  seq=1000, length=500   -> covers bytes 1000..1499
Segment 2:  seq=1500, length=500   -> covers bytes 1500..1999
Segment 3:  seq=2000, length=200   -> covers bytes 2000..2199

The receiver replies with ack=2200 once everything up to byte 2199 has arrived. ACKs are cumulative: ack=2200 implicitly acknowledges everything before it.
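
To make the arithmetic concrete, here is a minimal sketch of how a cumulative ACK could be computed from the segments that have arrived. The `Segment` type and the `lastByte` and `cumulativeAck` helpers are hypothetical names made up for this example, and 32-bit sequence wraparound is ignored.

```typescript
// Hypothetical types and helpers, for illustration only.
type Segment = { seq: number; length: number };

// Last byte covered by a segment (inclusive).
function lastByte(s: Segment): number {
  return s.seq + s.length - 1;
}

// Given the next byte expected and the segments that have arrived,
// return the cumulative ACK: the first byte not yet received contiguously.
function cumulativeAck(rcvNxt: number, arrived: Segment[]): number {
  const sorted = [...arrived].sort((a, b) => a.seq - b.seq);
  let next = rcvNxt;
  for (const s of sorted) {
    if (s.seq > next) break; // gap: stop at the first hole
    next = Math.max(next, s.seq + s.length);
  }
  return next;
}

const segments: Segment[] = [
  { seq: 1000, length: 500 },
  { seq: 1500, length: 500 },
  { seq: 2000, length: 200 },
];
const ackForAll = cumulativeAck(1000, segments); // 2200
```

If the middle segment were missing, the same function would stop at the hole and return 1500, which is exactly the "next byte expected" a receiver would keep advertising.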

A Lost Segment Example

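Sender                       Receiver
  seq=1000, len=500   ──▶
  seq=1500, len=500   ──▶ (lost!)
  seq=2000, len=500   ──▶
  seq=2500, len=500   ──▶
  seq=3000, len=500   ──▶
                          ◀──  ack=1500   (got 1st segment)
                          ◀──  ack=1500   (duplicate 1: got 3rd, but 2nd missing!)
                          ◀──  ack=1500   (duplicate 2: got 4th, still waiting)
                          ◀──  ack=1500   (duplicate 3: got 5th, still waiting)
  retransmit seq=1500 ──▶
                          ◀──  ack=3500   (now caught up)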

Three duplicate ACKs are the classic loss signal: modern TCP triggers fast retransmit, resending the missing segment immediately instead of waiting for the timer.
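
A sender-side sketch of the duplicate-ACK rule might look like the following. The `DupAckDetector` class and the `DUP_ACK_THRESHOLD` constant are hypothetical names for this example; a real stack also ties this into congestion control, which is left out here.

```typescript
// Hypothetical sender-side duplicate-ACK counter (illustration only).
const DUP_ACK_THRESHOLD = 3;

class DupAckDetector {
  private lastAck = -1;
  private dupCount = 0;

  // Feed each incoming ACK number. Returns the sequence number to
  // fast-retransmit, or null if nothing should be retransmitted yet.
  onAck(ackNo: number): number | null {
    if (ackNo === this.lastAck) {
      this.dupCount += 1;
      // Fire exactly once, when the third duplicate arrives.
      return this.dupCount === DUP_ACK_THRESHOLD ? ackNo : null;
    }
    // A different ACK number (in practice a higher one): reset the counter.
    this.lastAck = ackNo;
    this.dupCount = 0;
    return null;
  }
}

const detector = new DupAckDetector();
const decisions = [1500, 1500, 1500, 1500, 3500].map((a) => detector.onAck(a));
// decisions: [null, null, null, 1500, null]
```

The first ack=1500 is the original; only the three repeats after it count as duplicates, so the retransmit fires on the fourth identical ACK.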

Retransmission Timeout (RTO)

The sender doesn’t pick the timeout randomly. It estimates the round-trip time (RTT) continuously and sets RTO based on it.

RTO ≈ smoothed_RTT + 4 * RTT_variance

If a network is fast and stable, RTO is tight. If RTT spikes, RTO grows. If we still don’t get an ACK, RTO doubles on each retry (exponential backoff).
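
The formula and the backoff rule can be sketched as a small estimator. This follows the general shape of RFC 6298 (gains of 1/8 and 1/4), but it is a simplified illustration: the `RtoEstimator` class is a hypothetical name, the clock-granularity term is omitted, and the floor and ceiling are constructor parameters with illustrative defaults.

```typescript
// Sketch of an RTO estimator in the style of RFC 6298 (times in ms).
// Assumptions: no clock-granularity term; floor and ceiling are
// constructor parameters with illustrative defaults.
const ALPHA = 1 / 8; // weight of a new sample in the smoothed RTT
const BETA = 1 / 4;  // weight of a new sample in the RTT variance

class RtoEstimator {
  private srtt: number | null = null;
  private rttvar = 0;
  private readonly minRtoMs: number;
  private readonly maxRtoMs: number;
  rto: number;

  constructor(minRtoMs = 1000, maxRtoMs = 60000) {
    this.minRtoMs = minRtoMs;
    this.maxRtoMs = maxRtoMs;
    this.rto = minRtoMs; // placeholder until the first sample arrives
  }

  // Fold a fresh RTT measurement into the estimate and recompute RTO.
  onRttSample(rttMs: number): void {
    if (this.srtt === null) {
      // The first sample seeds both values.
      this.srtt = rttMs;
      this.rttvar = rttMs / 2;
    } else {
      // Update the variance first, using the previous smoothed RTT.
      this.rttvar = (1 - BETA) * this.rttvar + BETA * Math.abs(this.srtt - rttMs);
      this.srtt = (1 - ALPHA) * this.srtt + ALPHA * rttMs;
    }
    this.rto = this.clamp(this.srtt + 4 * this.rttvar);
  }

  // Exponential backoff: double RTO each time the timer fires.
  onTimeout(): void {
    this.rto = this.clamp(this.rto * 2);
  }

  private clamp(rtoMs: number): number {
    return Math.min(this.maxRtoMs, Math.max(this.minRtoMs, rtoMs));
  }
}

const estimator = new RtoEstimator(200);
estimator.onRttSample(100); // srtt = 100, rttvar = 50, rto = 300
estimator.onTimeout();      // rto doubles to 600
```

Steady samples shrink the variance term and tighten RTO, while a single RTT spike inflates it again, which matches the behavior described above.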

SACK — Selective Acknowledgement

Cumulative ACKs are simple but wasteful — if segment 5 of 10 is lost, the sender might retransmit 5 through 10 even though only 5 was lost.

SACK (Selective ACK) lets the receiver say: “I got up to byte X, AND I separately have bytes Y..Z.” The sender then resends only what’s actually missing. SACK is specified in RFC 2018.

ack=1500, SACK=2000-2500    means "next expected = 1500, but I also have 2000..2499" (the right edge is exclusive)

Most modern TCP stacks negotiate SACK during the handshake, using the SACK-Permitted option in the SYN segments.
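
As a rough sketch of what the sender does with this information, the helper below turns a cumulative ACK plus SACK blocks into the list of holes worth retransmitting. `ByteRange` and `missingRanges` are hypothetical names for this example; ranges are written as [start, end) with an exclusive end, and data beyond the highest SACKed byte is treated as still in flight rather than lost.

```typescript
// Hypothetical helper: which byte ranges should the sender retransmit?
// Ranges are [start, end) with an exclusive end.
type ByteRange = [number, number];

function missingRanges(ackNo: number, sackBlocks: ByteRange[]): ByteRange[] {
  const sorted = [...sackBlocks].sort((a, b) => a[0] - b[0]);
  const holes: ByteRange[] = [];
  let cursor = ackNo; // everything below ackNo is cumulatively acknowledged
  for (const [left, right] of sorted) {
    if (left > cursor) holes.push([cursor, left]); // gap before this block
    cursor = Math.max(cursor, right);
  }
  return holes;
}

const holes = missingRanges(1500, [[2000, 2500]]);
// holes: [[1500, 2000]]
```

With cumulative ACKs alone the sender would only know "1500 is missing" and might resend everything after it; with the SACK block it can resend just the 500 bytes in the hole.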

Detecting Duplicates

Because sequence numbers identify exact bytes, the receiver can tell if a segment is a duplicate (e.g. sender retransmitted but the original eventually arrived). The receiver discards the duplicate bytes and simply re-sends its current ACK, so the app never sees the same data twice.
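
A minimal sketch of that check, comparing an arriving segment against the next byte expected. `classifySegment` and `Verdict` are hypothetical names; the sketch ignores sequence wraparound and only looks at `rcvNxt`, so spotting a duplicate of data that is already buffered out of order would need a further look at the reassembly buffer.

```typescript
// Hypothetical receiver-side classification of an arriving segment
// relative to rcvNxt (the next byte expected).
type Verdict = "duplicate" | "in-order" | "out-of-order" | "partial-overlap";

function classifySegment(seq: number, length: number, rcvNxt: number): Verdict {
  const end = seq + length; // first byte after the segment
  if (end <= rcvNxt) return "duplicate";   // every byte was already received
  if (seq === rcvNxt) return "in-order";   // exactly what we were waiting for
  if (seq > rcvNxt) return "out-of-order"; // leaves a gap, so buffer it
  return "partial-overlap";                // old prefix, new suffix
}

const verdict = classifySegment(1000, 500, 1500); // "duplicate"
```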

Out-of-Order Delivery

The IP layer doesn’t guarantee order. A segment with seq=2000 might arrive before seq=1500. The receiver:

  1. Buffers the out-of-order segment.
  2. Sends a duplicate ACK for the byte it’s still waiting for.
  3. Once the gap fills, delivers everything to the app in order.

The app never sees out-of-order bytes. That’s the whole point.
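
The three steps above can be sketched as a tiny reassembly buffer. The `Reassembler` class is a hypothetical illustration, not how any particular kernel does it: payloads are plain strings (one character per byte), and overlapping segments, window limits, and sequence wraparound are ignored.

```typescript
// Hypothetical reassembly buffer (illustration only).
class Reassembler {
  private rcvNxt: number;
  // Buffered payloads keyed by their starting sequence number.
  private pending: Record<number, string | undefined> = {};

  constructor(initialSeq: number) {
    this.rcvNxt = initialSeq;
  }

  // Accept one segment. Returns the data that became deliverable, in
  // order, plus the cumulative ACK to send back.
  receive(seq: number, payload: string): { delivered: string; ack: number } {
    if (seq >= this.rcvNxt && this.pending[seq] === undefined) {
      this.pending[seq] = payload; // buffer new data; already-delivered data is ignored
    }
    // Drain every buffered segment that now lines up with rcvNxt.
    let delivered = "";
    let next = this.pending[this.rcvNxt];
    while (next !== undefined) {
      delete this.pending[this.rcvNxt];
      delivered += next;
      this.rcvNxt += next.length;
      next = this.pending[this.rcvNxt];
    }
    return { delivered, ack: this.rcvNxt };
  }
}

const reassembler = new Reassembler(1000);
const first = reassembler.receive(1000, "AAAAA");  // delivers "AAAAA", ack 1005
const early = reassembler.receive(1010, "CCCCC");  // buffered, duplicate ack 1005
const filled = reassembler.receive(1005, "BBBBB"); // delivers "BBBBBCCCCC", ack 1015
```

Note how the early segment produces no delivery and a repeated ACK, and how filling the gap releases both segments to the app at once.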

Checksums — Detecting Corruption

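Every TCP segment carries a 16-bit checksum covering the header and payload (plus a small pseudo-header made of the IP addresses, protocol number, and TCP length). If a bit flips on the wire, the checksum fails and the receiver drops the segment without ACKing. Since no ACK arrives, the sender eventually retransmits.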

The checksum is weak by modern standards (Ethernet’s CRC and TLS’s MAC are stronger), but it catches casual corruption.
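
For a feel of how simple the checksum is, here is a sketch of the 16-bit ones' complement sum described in RFC 1071. The function name `internetChecksum` is made up for this example, and the pseudo-header that real TCP also sums is left out.

```typescript
// 16-bit ones' complement checksum in the style of RFC 1071.
// Simplification: the TCP pseudo-header is not included here.
function internetChecksum(data: Uint8Array): number {
  let sum = 0;
  for (let i = 0; i < data.length; i += 2) {
    const hi = data[i];
    const lo = i + 1 < data.length ? data[i + 1] : 0; // pad odd length with zero
    sum += (hi << 8) | lo;
  }
  // Fold carries above 16 bits back into the low 16 bits.
  while (sum > 0xffff) {
    sum = (sum & 0xffff) + (sum >>> 16);
  }
  return ~sum & 0xffff;
}

// Example bytes taken from RFC 1071's worked example.
const sample = new Uint8Array([0x00, 0x01, 0xf2, 0x03, 0xf4, 0xf5, 0xf6, 0xf7]);
const checksum = internetChecksum(sample); // 0x220d
```

A receiver can verify a segment by running the same sum over the data with the checksum included: an undamaged segment yields 0. Because it is just an addition, some multi-bit errors cancel out, which is why the stronger checks mentioned above still matter.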

Connection State Tracks All This

Each TCP connection keeps track of:

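type TCPState = {
  sndUna: number   // oldest unacknowledged byte
  sndNxt: number   // next byte to send
  rcvNxt: number   // next byte expected from peer
  rttSample: number   // most recent RTT measurement
  rto: number   // current retransmission timeout
  sackBlocks: Array<[number, number]>   // out-of-order byte ranges reported via SACK
}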

The OS kernel maintains all this for each socket — and that’s why TCP is “expensive” compared to UDP.
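
To show how a couple of these fields move, here is a small sketch of applying a cumulative ACK on the sender side. `SenderState`, `bytesInFlight`, and `applyAck` are hypothetical names, the state is cut down to two fields, and sequence wraparound is ignored.

```typescript
// Hypothetical subset of the connection state, for illustration.
type SenderState = { sndUna: number; sndNxt: number };

// Bytes sent but not yet acknowledged.
function bytesInFlight(s: SenderState): number {
  return s.sndNxt - s.sndUna;
}

// Apply a cumulative ACK. ACKs that do not advance sndUna, or that
// acknowledge data never sent, leave the state unchanged.
function applyAck(s: SenderState, ackNo: number): SenderState {
  if (ackNo <= s.sndUna || ackNo > s.sndNxt) return s;
  return { sndUna: ackNo, sndNxt: s.sndNxt };
}

const before: SenderState = { sndUna: 1000, sndNxt: 2500 };
const after = applyAck(before, 1500);
// bytesInFlight(before) is 1500; bytesInFlight(after) is 1000
```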

Common Gotcha

TCP is reliable, but not instant. A retransmission can add hundreds of milliseconds. Apps that need real-time delivery (voice, gaming) often pick UDP precisely to avoid TCP’s reliability mechanisms — fresh data is more important than complete data.

Interview Tip

A favorite question: “If TCP is reliable, why do real-time apps avoid it?” The answer is head-of-line blocking: one lost packet stalls everything behind it because TCP must deliver in order. UDP imposes no ordering at all, and QUIC orders data only within each stream, so a loss on one stream doesn’t stall the others.
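
A toy model can make the stall visible. In this sketch (all names and numbers are made up for illustration), `arrivalMs[i]` is the time packet i finally reaches the receiver, with a retransmitted packet simply arriving late; in-order delivery hands packet i to the app only once every earlier packet has arrived too.

```typescript
// Hypothetical model of in-order delivery times, in milliseconds.
function inOrderDeliveryTimes(arrivalMs: number[]): number[] {
  let latestSoFar = 0;
  return arrivalMs.map((t) => {
    latestSoFar = Math.max(latestSoFar, t);
    return latestSoFar;
  });
}

// Packet 1 is lost and its retransmission arrives at 250 ms.
const arrivals = [10, 250, 30, 40];
const tcpLike = inOrderDeliveryTimes(arrivals); // [10, 250, 250, 250]
const udpLike = arrivals; // each packet is handed over as soon as it arrives
```

Packets 2 and 3 reached the receiver at 30 ms and 40 ms, but under in-order delivery the app only sees them at 250 ms, once the retransmission has filled the hole.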