The primary challenge TCP addresses is the unreliable, best-effort nature of the underlying network layer protocol, IP. To understand TCP's significance, one must first understand the limitations it is designed to overcome.
1.1. The "Connectionless" Nature of IP
The Internet Protocol (IP) is a connectionless service. Its sole duty is to make a best-effort attempt to deliver packets from a source host to a destination host.
• No Guarantees: IP does not care about the data within the packets it forwards. It offers no guarantee of delivery, no confirmation of receipt, and no mechanism to ensure packets arrive in the order they were sent.
• Packet Loss: Network routers may drop packets when they are congested (e.g., their queues are full or their CPU utilization is at 100%). IP has no built-in mechanism to detect or recover from this loss.
• The "I Don't Love You" Analogy: A powerful illustration of this problem involves a message from a character named Anjeli to another named Rahul.
◦ Message: "I don't love you"
◦ Segmentation: The message is broken into four packets: [I], [don't], [love], [you].
◦ Packet Loss Scenario: A congested router on the IP network drops the [don't] packet.
◦ Result: Rahul receives the packets [I], [love], [you] and the application layer reconstructs the message as "I love you." The meaning is completely inverted due to the loss of a single packet, highlighting the critical need for a reliability mechanism for certain applications.
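The scenario above can be reproduced in a few lines. This is a minimal sketch, not a network simulation: the function name deliver_best_effort and its dropped parameter are hypothetical, chosen only to illustrate silent loss.

```python
def deliver_best_effort(packets, dropped):
    """Mimic best-effort delivery: packets whose index is in `dropped`
    vanish silently, and nothing informs the receiver."""
    return [p for i, p in enumerate(packets) if i not in dropped]


message = "I don't love you"
packets = message.split(" ")  # ["I", "don't", "love", "you"]

# A congested router drops the second packet (index 1).
received = deliver_best_effort(packets, dropped={1})

# The application joins whatever arrived, unaware that anything is missing.
reassembled = " ".join(received)
print(reassembled)
```

Because the packets carry no numbering, the receiver has no way to notice the gap. That is the problem sequence numbers are meant to solve.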
1.2. The Role of UDP
User Datagram Protocol (UDP), another transport layer protocol, operates similarly to IP in that it does not provide reliability. It takes data from the application, adds a small header, and sends it without establishing a connection or tracking delivery. This makes it suitable for real-time applications (e.g., streaming video, online gaming) where speed is paramount and the occasional loss of a packet is acceptable.
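To make the "small header" concrete, the sketch below builds a UDP datagram by hand. The 8-byte layout (source port, destination port, length, checksum) is the standard UDP header; the function name build_udp_datagram and the port numbers are illustrative assumptions, and the checksum is left at 0, which over IPv4 means "not computed".

```python
import struct


def build_udp_datagram(src_port, dst_port, payload):
    """Prepend an 8-byte UDP header to the payload (illustrative only)."""
    length = 8 + len(payload)  # header plus data, in bytes
    checksum = 0               # 0 = checksum not computed (allowed over IPv4)
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


# Hypothetical ports, chosen only for the example.
datagram = build_udp_datagram(50000, 5004, b"video-frame")
print(len(datagram))
```

Note what the header does not contain: no sequence number, no acknowledgement number, and no connection state. Nothing in it lets the receiver detect a missing datagram.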
2. The TCP Solution: The Meaning of a "Connection"
TCP is described as a "connection-oriented" protocol. This "connection" is not a physical wire stretched between two endpoints. Instead, it is a logical state established through an initial exchange of messages before any application data is sent.
• Establishing a Logical State: The connection is the result of a preliminary conversation (the handshake) where the two communicating parties agree on the parameters of their session.
• Key Goals of the Connection:
1. Confirmation of Presence: Both parties confirm that a responsive entity exists at the remote end.
2. Synchronization of Sequence Numbers: Both parties agree on the starting sequence number for their respective data streams. This is the foundation for tracking and ordering all subsequent data.
This process of "thinking before sending" ensures that a stable, understood context for communication exists before risking the transmission of valuable application data.
3. Core Mechanisms of TCP Reliability
TCP achieves reliable, ordered delivery through two primary, interrelated mechanisms: sequence numbers and acknowledgements. (Error detection is handled separately, by a checksum carried in the TCP header.)
3.1. Sequence Numbers
The intuitive solution to the packet loss problem is to number each packet. If the receiver gets packets 1, 3, and 4, it knows that packet 2 is missing and can wait for it before passing the complete data to the application. TCP implements a more granular version of this concept.
• Byte-Level Sequencing: TCP assigns a unique sequence number to every byte of data in a stream. It does not number the segments themselves.
• Segment Sequence Number: The sequence number field in a TCP segment's header refers to the sequence number of the first byte of data contained within that segment.
• Calculating the Next Sequence Number: The sequence number for the next segment in a stream can be calculated with a simple formula.
◦ Next Sequence Number = Current Sequence Number + Length of Data in Current Segment
◦ Example: If a segment starts with sequence number 215 and contains 100 bytes of data, the next segment will start with sequence number 315 (215 + 100).
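The formula above can be sketched as a one-line function. The name next_sequence_number is hypothetical. The modulo step is an addition not covered in the text: the sequence number field is 32 bits wide, so the value wraps around after 2^32 - 1.

```python
SEQ_SPACE = 2 ** 32  # the sequence number field is 32 bits wide


def next_sequence_number(current_seq, data_length):
    """Sequence number of the first byte of the following segment."""
    return (current_seq + data_length) % SEQ_SPACE


print(next_sequence_number(215, 100))  # 315
```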
3.2. Acknowledgement (ACK) Numbers
Acknowledgements are the feedback mechanism that makes TCP a closed-loop system. The receiver uses the acknowledgement number to inform the sender what data has been successfully received.
• Cumulative Acknowledgement: The ACK number is cumulative. It signifies the next byte the receiver expects to receive.
• Meaning of the ACK Number: An acknowledgement number of 101 explicitly means: "I have successfully and correctly received all bytes up to and including byte 100. Please send me the data beginning with byte 101."
• Confirmation Loop: When a sender receives this ACK, it knows that its segment containing bytes 1-100 was successfully delivered and can proceed to send the next segment starting with byte 101.
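The cumulative rule can be sketched as follows. This is a simplified receiver model under stated assumptions: segments are represented as hypothetical (seq, length) pairs, sequence number wraparound is ignored, and the function name cumulative_ack is illustrative.

```python
def cumulative_ack(first_expected, segments):
    """Return the ACK number for a set of received (seq, length) segments.

    The ACK advances only across contiguous data; the first gap stops it.
    """
    expected = first_expected
    for seq, length in sorted(segments):
        if seq > expected:  # a gap: the bytes before `seq` are missing
            break
        expected = max(expected, seq + length)
    return expected


# Bytes 1-100 arrived: the receiver asks for byte 101.
print(cumulative_ack(1, [(1, 100)]))
# Bytes 1-100 and 201-300 arrived, but 101-200 are missing.
print(cumulative_ack(1, [(1, 100), (201, 100)]))
```

In the second call the ACK number stays at 101 even though later data arrived, which is how the sender learns that the segment starting at byte 101 has not been received.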
4. Establishing the Connection: The TCP Handshake
The TCP handshake is the process of establishing the logical connection. While it can be understood as a four-step logical process, it is optimized in practice into a Three-Way Handshake.
4.1. TCP Header Flags
The handshake process is managed by setting specific 1-bit flags in the TCP header. The two most important for connection establishment are:
• SYN (Synchronize): A flag indicating a request to initiate a connection and synchronize sequence numbers.
• ACK (Acknowledge): A flag indicating that the acknowledgement number field in the header is significant and contains a valid number.
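Since each flag is a single bit, a segment's flags can be tested with bit masks. The sketch below uses the standard bit values from the TCP header layout (SYN = 0x02, ACK = 0x10); the function name describe_flags is hypothetical.

```python
SYN = 0x02  # synchronize sequence numbers
ACK = 0x10  # acknowledgement number field is valid


def describe_flags(flags_byte):
    """List which of the two handshake flags are set in a flags byte."""
    names = []
    if flags_byte & SYN:
        names.append("SYN")
    if flags_byte & ACK:
        names.append("ACK")
    return names


print(describe_flags(SYN))        # a connection request
print(describe_flags(SYN | ACK))  # both bits set in one segment
```

Because the flags are independent bits, one segment can carry SYN and ACK at the same time.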
4.2. From Four-Way Logic to Three-Way Efficiency
To build the concept from first principles, the process can be viewed as four distinct messages:
1. Anjeli -> Rahul (SYN): "I want to communicate. My initial sequence number (ISN) is X."
2. Rahul -> Anjeli (ACK): "I acknowledge your request. I am expecting your next byte to be X+1."
3. Rahul -> Anjeli (SYN): "I also want to communicate. My initial sequence number (ISN) is Y."
4. Anjeli -> Rahul (ACK): "I acknowledge your request. I am expecting your next byte to be Y+1."
For efficiency, TCP combines steps 2 and 3 into a single message, resulting in the well-known Three-Way Handshake.
4.3. The Three-Way Handshake Process
Step | Direction | Packet Description | Flags Set | Key Information |
1 | Client -> Server | Synchronize (SYN) | SYN | The client initiates the connection, sending its Initial Sequence Number (e.g., Seq=0). No application data is sent (Length=0). |
2 | Server -> Client | Synchronize-Acknowledge (SYN-ACK) | SYN, ACK | The server acknowledges the client's SYN (Ack=1) and sends its own Initial Sequence Number (Seq=0). No application data is sent (Length=0). |
3 | Client -> Server | Acknowledge (ACK) | ACK | The client acknowledges the server's SYN (Ack=1). The connection is now established, and data transfer can begin. |
Note: In practice, Initial Sequence Numbers are large, random numbers. Wireshark often displays them as relative numbers (starting from 0) to simplify analysis.
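The exchange can be modelled as three small records. This is a sketch of the bookkeeping only, with no real networking: the dictionary layout and the function name three_way_handshake are hypothetical, and random.getrandbits stands in for however a real stack chooses its ISN.

```python
import random

SEQ_SPACE = 2 ** 32


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as dictionaries."""
    x = random.getrandbits(32) if client_isn is None else client_isn
    y = random.getrandbits(32) if server_isn is None else server_isn
    syn = {"dir": "client->server", "flags": ["SYN"], "seq": x, "ack": None}
    syn_ack = {"dir": "server->client", "flags": ["SYN", "ACK"],
               "seq": y, "ack": (x + 1) % SEQ_SPACE}
    ack = {"dir": "client->server", "flags": ["ACK"],
           "seq": (x + 1) % SEQ_SPACE, "ack": (y + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


# Relative numbering, as a packet analyzer would display it.
for segment in three_way_handshake(client_isn=0, server_isn=0):
    print(segment)
```

Passing ISNs of 0 reproduces the relative numbers in the table. Although a SYN carries no application data, it consumes one sequence number, which is why each side acknowledges ISN + 1.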
5. Data Segmentation and Sizing
TCP must break the continuous stream of data from an application into smaller pieces called segments before sending them to the IP layer. The size of these segments is carefully determined to optimize network performance.
• Maximum Transmission Unit (MTU): The underlying data link layer (e.g., Ethernet) limits how much data a single frame can carry. This limit is the MTU, typically 1500 bytes for Ethernet.
• Maximum Segment Size (MSS): To avoid inefficient IP-level fragmentation, the transport layer calculates the maximum amount of application data it can fit into one segment. This is the MSS.
• MSS Calculation: The MSS is determined by subtracting the size of the IP and TCP headers from the MTU.
◦ MSS = MTU - IP Header Size - TCP Header Size
◦ Example: 1460 bytes = 1500 bytes (MTU) - 20 bytes (IP Header) - 20 bytes (TCP Header), assuming IPv4 and TCP headers without options.
• By creating segments with a data payload no larger than the MSS, TCP ensures that the resulting IP packets will not exceed the network's MTU.
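The MSS calculation and the resulting segmentation can be sketched together. The function names mss and segment are hypothetical, the 20-byte defaults assume headers without options, and starting the stream at sequence number 1 is an arbitrary choice for readability.

```python
def mss(mtu, ip_header=20, tcp_header=20):
    """Maximum Segment Size: the MTU minus both headers."""
    return mtu - ip_header - tcp_header


def segment(data, max_segment_size, first_seq=1):
    """Split a byte stream into (sequence number, chunk) pairs."""
    segments = []
    for offset in range(0, len(data), max_segment_size):
        chunk = data[offset:offset + max_segment_size]
        segments.append((first_seq + offset, chunk))
    return segments


size = mss(1500)
stream = bytes(3000)  # 3000 bytes of placeholder application data
parts = segment(stream, size)
print(size, [(seq, len(chunk)) for seq, chunk in parts])
```

With a 1500-byte MTU, the 3000-byte stream becomes two full segments of 1460 bytes and a final segment of 80 bytes, starting at sequence numbers 1, 1461, and 2921.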
6. Application and Port Management
On any given host (client or server), multiple applications may need to use the network simultaneously. TCP uses port numbers to manage these communications.
• Port Numbers: A port is a 16-bit number that identifies a specific application or process on a host. For example, BGP uses port 179, and HTTPS uses port 443.
• Socket: The combination of an IP address and a port number (e.g., 217.21.19.11:443) creates a unique endpoint for communication, known as a socket.
• Demultiplexing: When a TCP segment arrives at a host, the operating system uses the destination port number in the TCP header to deliver the data to the correct application (e.g., the web server process) and not any other running application.
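Demultiplexing can be pictured as a table lookup keyed by destination port. This is a deliberately simplified sketch: the handler functions, the listeners table, and the dictionary-shaped segment are all hypothetical, and a real operating system matches on more than the destination port alone.

```python
def web_server(payload):
    return "web server received " + payload.decode()


def bgp_daemon(payload):
    return "BGP daemon received " + payload.decode()


# Hypothetical table of listening applications, keyed by port number.
listeners = {443: web_server, 179: bgp_daemon}


def demultiplex(segment):
    """Hand the payload to whichever application owns the destination port."""
    handler = listeners.get(segment["dst_port"])
    if handler is None:
        return None  # no application is listening on this port
    return handler(segment["payload"])


result = demultiplex({"dst_port": 443, "payload": b"GET /"})
print(result)
```

A segment addressed to port 443 reaches only the web server handler; the BGP handler on port 179 never sees it.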