I keep seeing requests on various newsgroups for an introduction to TCP/IP. I also get such requests locally. I believe that the only appropriate description of TCP/IP is the RFC's. However I also think a brief introduction is likely to be helpful before plowing right into them. The following document is an attempt to do that. It also recommends some RFC's to look at and tells you how to get them.
This document is a brief introduction to TCP/IP, followed by advice on what to read for more information. This is not intended to be a complete description, but merely enough of an introduction to allow you to start reading the RFC's. At the end of the document there will be a list of the RFC's that we recommend reading.
TCP/IP is a set of protocols developed to allow cooperating computers to share resources across a network. It was developed by a community of researchers centered around the ARPAnet. Certainly the ARPAnet is the best-known TCP/IP network. However as of June 1987, at least 130 different vendors had products that support TCP/IP, and thousands of networks of all kinds use it.
First some basic definitions. Although TCP/IP (or IP/TCP) seems to be
the most common term these days, most of the documentation refers to
Whatever it is called, TCP/IP is a family of protocols. A few are
basic ones used for many applications. These include IP, TCP, and
UDP. Others are protocols for doing specific tasks, e.g. transferring
files between computers, sending mail, or finding out who is logged in
on another computer. Any real application will use several of these
protocols. A typical situation is sending mail. First, there is a
protocol for mail. This defines a set of commands which one machine
sends to another, e.g. commands to specify who the sender of the
message is, who it is being sent to, and then the text of the message.
However this protocol assumes that there is a way to communicate
reliably between the two computers. Mail, like other application
protocols, simply defines a set of commands and messages to be sent.
It is designed to be used together with TCP and IP. TCP is responsible
for making sure that the commands get through to the other end. It
keeps track of what is sent, and retransmits anything that did not
get through. If any message is too large for one packet, e.g. the
text of the mail, TCP will split it up into several packets, and make
sure that they all arrive correctly. Since these functions are needed
for many applications, they are put together into a separate protocol,
rather than being part of the specifications for sending mail. You
can think of TCP as forming a library of routines that applications
can use when they need reliable network communications with another
computer. Similarly, TCP calls on the services of IP. Although the
services that TCP supplies are needed by many applications, there are
still some kinds of applications that don't need them. However there
are some services that every application needs. So these services are
put together into IP. As with TCP, you can think of IP as a library
of routines that TCP calls on, but which is also available to
applications that don't use TCP. This strategy of building several
levels of protocol is called
TCP/IP is based on the
Of course we normally refer to systems by name, rather than by Internet address. When we specify a name, the network software looks it up in a database, and comes up with the corresponding Internet address. Most of the network software deals strictly in terms of the address. (RFC 882 describes the database used to look up names.)
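As a rough illustration of the lookup just described, here is a minimal Python sketch. The host table, the name "example-host", and the address 192.0.2.10 are all made up for the example; a real system consults the name database rather than a dictionary. The point is only that a name turns into a 32-bit number, and that number is what the rest of the software works with.

```python
import socket
import struct

# Hypothetical host table standing in for the real name database.
HOST_TABLE = {"example-host": "192.0.2.10"}

def lookup(name):
    """Return the 32-bit Internet address for a name in HOST_TABLE."""
    dotted = HOST_TABLE[name]
    packed = socket.inet_aton(dotted)      # 4 octets, network byte order
    return struct.unpack("!I", packed)[0]  # the same 4 octets as one integer

address = lookup("example-host")
print(hex(address))
```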
TCP/IP is a
It may have occurred to you that something is missing here. We have
talked about Internet addresses, but not about how you keep track of
multiple connections to a given system. Clearly it isn't enough to
get a packet to the right destination. TCP has to know which
connection this packet is part of. This task is referred to as
We start with a single data stream, say a file you are trying to send to some other computer:
TCP breaks it up into manageable chunks. (In order to do this, TCP has to know how large a packet your network can handle. Actually, the TCP's at each end say how big a packet they can handle, and then they pick the smaller size.)
.... .... .... .... .... .... .... ....
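The splitting step can be sketched in a few lines of Python. This is only an illustration: the function name and the two size limits are invented for the example, and a real TCP negotiates the size when the connection is opened rather than taking it as an argument.

```python
def split_stream(data, local_max, remote_max):
    """Split a byte stream into chunks no larger than either end can handle."""
    size = min(local_max, remote_max)  # each end states a limit; use the smaller
    return [data[i:i + size] for i in range(0, len(data), size)]

stream = bytes(4000)  # a 4000-octet "file" of zeros
chunks = split_stream(stream, local_max=1460, remote_max=500)
print(len(chunks), "chunks of", len(chunks[0]), "octets")
```

With these made-up limits the 4000-octet stream becomes eight 500-octet chunks, one for each group of dots in the picture above.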
TCP puts a header at the front of each packet. This header actually
contains at least 20 octets, but the most important ones are a source
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      various other junk                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      various other junk                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |          other junk           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                 your data ... next 500 octets                 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            ......                             |
If we abbreviate the TCP header as
T.... T.... T.... T.... T.... T.... T.... T....
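To make the header concrete, here is a hedged Python sketch that packs a minimal 20-octet TCP header and puts it in front of each chunk. The port numbers (1234 and 25) and the starting sequence number of 0 are arbitrary choices for the example. The checksum is left at zero, since computing the real one involves more than this sketch covers, and the "other junk" fields are simply zeroed.

```python
import struct

def tcp_header(src_port, dst_port, seq, checksum=0):
    """Pack a minimal 20-octet TCP header; fields not discussed are zero."""
    ack = 0                 # acknowledgment number, unused here
    offset_flags = 5 << 12  # header length of 5 32-bit words, no flags set
    window = 0
    urgent = 0
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_flags, window, checksum, urgent)

chunks = [b"a" * 500, b"b" * 500]
packets = []
seq = 0
for chunk in chunks:
    packets.append(tcp_header(1234, 25, seq) + chunk)
    seq += len(chunk)       # sequence numbers count octets, not packets

print([len(p) for p in packets])
```

Each packet is now 520 octets: 20 of header followed by 500 of data.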
TCP now sends each of these packets to IP. Of course it has to tell IP the Internet address of the computer at the other end. Note that this is all IP is concerned about. It doesn't care about what is in the packet, or even in the TCP header. IP's job is simply to find a route for the packet and get it to the other end. In order to allow gateways or other intermediate systems to forward the packet, it adds its own header.
The main things in this header are the source and destination Internet addresses (32-bit addresses, like 188.8.131.52), the protocol number, and another checksum. The source Internet address is simply the address of your machine. (This is necessary so the other end knows where the packet came from.) The destination Internet address is the address of the other machine. (This is necessary so any gateways in the middle know where you want the packet to go.) The protocol number tells IP at the other end to send the packet to TCP. Although most IP traffic uses TCP, there are other protocols that can use IP, so you have to tell IP which protocol to send the packet to.
Finally, the checksum allows IP at the other end to verify that the header wasn't damaged in transit. Note that TCP and IP have separate checksums. This is because IP doesn't know anything about TCP. As far as IP is concerned, everything after its header is just a bunch of bits. So IP computes a checksum of its own header only, and IP at the other end checks it to make sure that the header didn't get damaged in transit. Protecting the data is left to the TCP checksum. Once IP has tacked on its header, here's what the message looks like:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      various other junk                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      various other junk                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     junk      |   Protocol    |        Header Checksum        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Source Address                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Destination Address                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  TCP header, then your data ......
If we represent the IP header by an
IT.... IT.... IT.... IT.... IT.... IT.... IT.... IT....
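The following Python sketch shows one way the IP header and its checksum might be built. The addresses 192.0.2.1 and 192.0.2.2 and the time-to-live of 64 are placeholder values chosen for the example. Protocol number 6 is the one assigned to TCP. The "junk" fields that this introduction skips over are mostly set to zero.

```python
import socket
import struct

def internet_checksum(data):
    """Ones' complement of the ones' complement sum of the 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                           # pad to a whole word
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                            # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def ip_header(src, dst, payload_len, protocol=6):
    """Pack a 20-octet IP header; protocol 6 means the payload goes to TCP."""
    fields = [(4 << 4) | 5,         # version 4, header length 5 words
              0,                    # type of service
              20 + payload_len,     # total length in octets
              0, 0,                 # identification, flags and fragment offset
              64,                   # time to live (arbitrary here)
              protocol,
              0,                    # checksum placeholder
              socket.inet_aton(src),
              socket.inet_aton(dst)]
    header = struct.pack("!BBHHHBBH4s4s", *fields)
    fields[7] = internet_checksum(header)   # checksum covers the header only
    return struct.pack("!BBHHHBBH4s4s", *fields)

header = ip_header("192.0.2.1", "192.0.2.2", payload_len=520)
print(len(header), internet_checksum(header))
```

The receiving end runs the same checksum routine over the header as it arrived. If nothing was damaged, the result is zero.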
At this point, it's possible that no more headers are needed. If your computer happens to have a direct phone line connecting it to the destination computer, or to a gateway, it may simply send the packets out on the line (though likely a synchronous protocol such as HDLC would be used, and it would add at least a few octets at the beginning and end).
However most of our networks these days use Ethernet. So now we have
to describe Ethernet's headers. Unfortunately, Ethernet has its own
addresses. The people who designed Ethernet wanted to make sure that
no two machines would end up with the same Ethernet address.
Furthermore, they didn't want the user to have to worry about
assigning addresses. So each Ethernet controller comes with an
address built in from the factory. In order to make sure that they
would never have to reuse addresses, the Ethernet designers allocated
48 bits for the Ethernet address. People who make Ethernet equipment
have to register with a central authority, to make sure that the
numbers they assign don't overlap any other manufacturer's. Ethernet is
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Ethernet destination address (first 32 bits)            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Ethernet dest (last 16 bits)  |Ethernet source (first 16 bits)|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       Ethernet source address (last 32 bits)                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Type code           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  IP header, then TCP header, then your data                   |
|                                                               |
  ...
|                                                               |
|   end of your data                                            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Ethernet Checksum                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
If we represent the Ethernet header with
EIT....C EIT....C EIT....C EIT....C EIT....C EIT....C
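The Ethernet wrapping can be sketched the same way. The two 48-bit addresses below are invented, and the 40-octet payload of zeros merely stands in for an IP header plus a TCP header. The type code 0x0800 is the one assigned to IP. The sketch uses the CRC-32 from Python's zlib module for the checksum and appends it low-order octet first. Treat that byte order as an assumption of this example, and note that padding of short frames is ignored.

```python
import struct
import zlib

ETHERTYPE_IP = 0x0800  # type code assigned to IP

def ethernet_frame(dst_addr, src_addr, payload):
    """Wrap a payload in a 14-octet Ethernet header and a 4-octet checksum."""
    header = dst_addr + src_addr + struct.pack("!H", ETHERTYPE_IP)
    body = header + payload
    checksum = struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)
    return body + checksum

dst = bytes.fromhex("020000000002")  # made-up 48-bit addresses
src = bytes.fromhex("020000000001")
payload = bytes(40)                  # stand-in for IP header + TCP header
frame = ethernet_frame(dst, src, payload)
print(len(frame))
```

The frame is 58 octets: 14 of Ethernet header, the 40-octet payload, and the 4-octet checksum at the end.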
When these packets are received by the other end, of course all the headers are removed. The Ethernet interface removes the Ethernet header and the checksum. It looks at the type code. Since the type code is the one assigned to IP, the Ethernet device driver passes the packet up to IP. IP removes the IP header. It looks at the IP protocol field. Since the protocol type is TCP, it passes the packet up to TCP. TCP now looks at the packet sequence number. It uses the sequence numbers and other information to combine all the packets into the original file.
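The last step, putting the pieces back in order, might look like the Python sketch below. It assumes the headers have already been stripped, so that each arriving piece is just a sequence number and its data. It also assumes that numbering starts at 0 and that every piece has arrived, which a real TCP cannot take for granted.

```python
def reassemble(segments):
    """Rebuild the original stream from (sequence_number, data) pairs."""
    stream = bytearray()
    for seq, data in sorted(segments):  # put the pieces in sequence order
        if seq != len(stream):          # a gap or overlap in the numbering
            raise ValueError("missing or duplicate data at octet %d" % len(stream))
        stream += data
    return bytes(stream)

# Pieces arriving out of order, as they might after taking different routes.
segments = [(500, b"B" * 500), (0, b"A" * 500), (1000, b"C" * 100)]
original = reassemble(segments)
print(len(original))
```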
This ends our initial summary of TCP/IP. There are still some crucial concepts we haven't gotten to, so we'll now go back and add details in several areas. (For detailed descriptions of the items discussed here, see RFC 793 for TCP, RFC 791 for IP, and RFC 894 and RFC 826 for sending IP over Ethernet.)