So far, we have described how a stream of data is broken up into
packets, sent to another computer, and put back together. However,
something more is needed in order to accomplish anything useful.
There has to be a way for you to open a connection to a specified
computer, log into it, tell it what file you want, and control the
transmission of the file. (If you have a different application in
mind, e.g. computer mail, some analogous protocol is needed.) This is
done by
Before going into more details about applications programs, we have to
describe how you find an application. Suppose you want to send a file
to a computer whose Internet address is 128.6.4.7. To start the
process, you need more than just the Internet address. You have to
connect to the file transfer server at the other end. In general,
network programs are specialized for a specific set of tasks. Most
systems have separate programs to handle file transfers, remote
terminal logins, mail, etc. When you connect to 128.6.4.7, you have
to specify that you want to talk to the file transfer program. This
is done by having
Note that a connection is actually described by a set of four numbers: the Internet address at each end, and the TCP port number at each end. Every packet has all four of those numbers in it. (The Internet addresses are in the IP header, and the TCP port numbers are in the TCP header.) In order to keep things straight, no two connections can have the same set of numbers. However, it is enough for any one number to be different. For example, it is perfectly possible for two different users on a machine to be sending files to the same other machine. This could result in connections with the following parameters:
                 Internet addresses         TCP ports
   connection 1  128.6.4.194, 128.6.4.7     1234, 21
   connection 2  128.6.4.194, 128.6.4.7     1235, 21
Since the same machines are involved, the Internet addresses are the same. Since they are both doing file transfers, one end of the connection involves the well-known port number for file transfer. The only thing that differs is the port number for the program that the users are running. That's enough of a difference. Generally, at least one end of the connection asks the network software to assign it a port number that is guaranteed to be unique. Normally, it's the user's end, since the server has to use a well-known number.
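The uniqueness rule can be illustrated with a small model. This is only a sketch, not how any real TCP implementation stores its state: the names ConnectionId and ConnectionTable are made up for the example, and the addresses and ports are the ones from the table above.

```python
from typing import Dict, NamedTuple


class ConnectionId(NamedTuple):
    """The four numbers that identify one connection."""
    local_addr: str
    local_port: int
    remote_addr: str
    remote_port: int


class ConnectionTable:
    """Hypothetical bookkeeping: one entry per open connection."""

    def __init__(self) -> None:
        self._owners: Dict[ConnectionId, str] = {}

    def open(self, conn: ConnectionId, owner: str) -> None:
        # Two connections may share any three numbers, but not all four.
        if conn in self._owners:
            raise ValueError(f"connection already in use: {conn}")
        self._owners[conn] = owner

    def owner_of(self, conn: ConnectionId) -> str:
        # An arriving packet carries all four numbers, so it can be
        # matched to exactly one connection.
        return self._owners[conn]


table = ConnectionTable()
conn1 = ConnectionId("128.6.4.194", 1234, "128.6.4.7", 21)
conn2 = ConnectionId("128.6.4.194", 1235, "128.6.4.7", 21)
table.open(conn1, "first user")
table.open(conn2, "second user")
```

The two entries differ only in the local port number, and that is enough for them to be kept apart. Trying to open conn1 a second time would raise an error.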
Now that we know how to open connections, let's get back to the applications programs. As mentioned above, once TCP has opened a connection, we have something that might as well be a simple wire. All the hard parts are handled by TCP and IP. However, we still need some agreement as to what we send over this connection. In effect this is simply an agreement on what set of commands the application will understand, and the format in which they are to be sent. Generally, what is sent is a combination of commands and data, and the two ends use context to tell them apart.
For example, the mail protocol works like this: Your mail program opens a connection to the mail server at the other end. Your program gives it your machine's name, the sender of the message, and the recipients you want it sent to. It then sends a command saying that it is starting the message. At that point, the other end stops treating what it sees as commands, and starts accepting the message. Your end then starts sending the text of the message. At the end of the message, a special mark is sent (a dot in the first column). After that, both ends understand that your program is again sending commands. This is the simplest way to do things, and the one that most applications use.
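The switch between commands and message text can be sketched as a small receiver. This is an illustration only, not an implementation of the mail protocol: the function name receive is made up, the command names are in the style of RFC 821, and the sketch ignores server replies and the rule for message lines that themselves begin with a dot.

```python
def receive(lines):
    """Split an incoming stream of lines into commands and message bodies."""
    commands = []
    messages = []
    body = None  # None: reading commands; a list: collecting message text
    for line in lines:
        if body is None:
            commands.append(line)
            if line == "DATA":
                # The sender has announced a message, so what follows
                # is no longer treated as commands.
                body = []
        elif line == ".":
            # A dot alone on a line marks the end of the message;
            # the receiver goes back to reading commands.
            messages.append("\n".join(body))
            body = None
        else:
            body.append(line)
    return commands, messages


session = [
    "HELO my-machine",
    "MAIL FROM:<sender>",
    "RCPT TO:<recipient>",
    "DATA",
    "This is the text",
    "of the message.",
    ".",
    "QUIT",
]
commands, messages = receive(session)
```

After this runs, commands holds the five command lines and messages holds the single two-line message. The same line of text would be treated differently depending on which mode the receiver is in, which is what "using context" amounts to.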
File transfer is somewhat more complex. The file transfer protocol
involves two different connections. It starts out just like mail.
The user's program sends commands like
Remote terminal connections use another mechanism still. For remote logins, there is just one connection. It normally sends data. When it is necessary to send a command (e.g. to set the terminal type or to change some mode), a special character is used to indicate that the next character is a command. If the user happens to type that special character as data, two of them are sent.
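The doubling rule can be sketched as follows. The code assumes the special character is the byte 255, which is the value Telnet uses, and it assumes every command is exactly one byte long. Real remote login commands can be longer, so treat this as a simplified model. The function names are made up for the example.

```python
ESCAPE = 0xFF  # assumed value of the special character


def encode_data(data: bytes) -> bytes:
    """Prepare user data for sending: double any escape byte in it."""
    return data.replace(bytes([ESCAPE]), bytes([ESCAPE, ESCAPE]))


def encode_command(code: int) -> bytes:
    """Prepare a one-byte command: the escape byte, then the command."""
    return bytes([ESCAPE, code])


def decode(stream: bytes):
    """Separate a received byte stream into data and command codes."""
    data = bytearray()
    commands = []
    i = 0
    while i < len(stream):
        byte = stream[i]
        if byte != ESCAPE:
            data.append(byte)          # ordinary data
            i += 1
        elif i + 1 < len(stream):
            following = stream[i + 1]
            if following == ESCAPE:
                data.append(ESCAPE)    # doubled escape: one data byte
            else:
                commands.append(following)
            i += 2
        else:
            break  # escape at the very end: the rest has not arrived yet
    return bytes(data), commands


wire = encode_data(b"a\xffb") + encode_command(241) + encode_data(b"c")
data, commands = decode(wire)
```

Here the user's data contains the special character once, so it travels as two bytes, and the receiver turns it back into one.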
We are not going to describe the application protocols in detail in
this document. It's better to read the RFCs yourself. However, there
are a couple of common conventions used by applications that will be
described here. First, the common network representation: TCP/IP is
intended to be usable on any computer. Unfortunately, not all
computers agree on how data is represented. There are differences in
character codes (ASCII vs. EBCDIC), in end of line conventions
(carriage return, line feed, or a representation using counts), and in
whether terminals expect characters to be sent individually or a line
at a time. In order to allow computers of different kinds to
communicate, each applications protocol defines a standard
representation. Note that TCP and IP do not care about the
representation. TCP simply sends octets. However, the programs at
both ends have to agree on how the octets are to be interpreted. The
RFC for each application specifies the standard representation for
that application. Normally it is
(For more details about the protocols mentioned in this section, see rfc821.txt and rfc822.txt for mail, rfc959.txt for file transfer, and rfc854.txt and rfc855.txt for remote logins. For the well-known port numbers, see the current edition of Assigned Numbers, and possibly rfc814.txt.)
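As an illustration of the common network representation discussed above, here is a sketch of the conversions a program might make at each end. It assumes the agreed representation is ASCII text with each line ended by carriage return, line feed, which is the convention in rfc854.txt. The function names are made up, and cp037 is just one of several EBCDIC code pages.

```python
def to_network_text(local_text: str) -> bytes:
    """Convert local text to the assumed network form: ASCII, CR LF lines."""
    lines = local_text.splitlines()  # accepts whatever line ends are local
    return "".join(line + "\r\n" for line in lines).encode("ascii")


def from_network_text(wire: bytes, local_newline: str = "\n") -> str:
    """Convert network text back to the local end of line convention."""
    return wire.decode("ascii").replace("\r\n", local_newline)


def ebcdic_to_network(ebcdic: bytes) -> bytes:
    """Re-code EBCDIC characters (code page 037 assumed) as ASCII octets."""
    return ebcdic.decode("cp037").encode("ascii")


wire = to_network_text("first line\nsecond line\n")
local = from_network_text(wire)
greeting = ebcdic_to_network("HELLO".encode("cp037"))
```

TCP carries the resulting octets without looking at them. Only the programs at the two ends need to apply these conversions.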