Chapter 3: Understanding HTTP

MIME




MIME (Multipurpose Internet Mail Extensions) has been mentioned too many times in the past few pages to ignore. Originally, internet e-mail was transmitted in 7-bit ASCII code; this is a historical accident dating to the early 1960s, when minimizing the amount of data exchanged was a priority and the need to transmit 8-bit data had not yet been recognized. (Some systems today still have difficulty handling 8-bit information streams.) This was not a satisfactory solution: it made the transport of complex binary data, or even ordinary text containing accented characters, impossible. MIME is a standard for internet multimedia mail transport, specified in RFC 1521 (which defines how the format of internet message bodies is specified and described) and RFC 1522 (which defines how non-ASCII text is carried in message headers).

MIME messages look structurally similar to normal RFC 822 e-mail messages, except for some additional header lines. MIME messages use the following additional header lines:
MIME-Version: The version of the MIME standard to which the message conforms (so that mail agents or web browsers know how to decode the message).
Content-Type: The type and subtype of the data in the body of the message, which identify how the data is represented. This usually consists of two values separated by a slash ("/").
Content-Transfer-Encoding: How the data is encoded to allow it to pass through mail gateways. (MIME usually uses a radix-64 encoding, known as base64, to send binary files, but other encoding mechanisms are available.)

In addition, two other header fields may be used: Content-ID: and Content-Description:.
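The header lines described above can be seen in a small sketch that uses Python's standard `email` package. This is only an illustration, not part of the MIME standard itself: the addresses, the subject, the Content-ID value, and the payload are placeholders invented for the example.

```python
from email.message import EmailMessage

# Arbitrary 8-bit data that could not travel safely as 7-bit ASCII.
payload = bytes(range(256))

msg = EmailMessage()
msg["From"] = "sender@example.com"        # placeholder address
msg["To"] = "recipient@example.com"       # placeholder address
msg["Subject"] = "MIME header demo"

# set_content() fills in MIME-Version, Content-Type and
# Content-Transfer-Encoding (base64 by default for bytes);
# cid= sets the optional Content-ID header.
msg.set_content(
    payload,
    maintype="application",
    subtype="octet-stream",
    cid="<part1@example.com>",            # placeholder identifier
)
# Optional free-text description; set after set_content(), which
# clears any existing Content-* headers.
msg["Content-Description"] = "256 bytes of test data"

for name in ("MIME-Version", "Content-Type", "Content-Transfer-Encoding",
             "Content-ID", "Content-Description"):
    print(f"{name}: {msg[name]}")
```

Although the payload contains every possible byte value, the serialized message consists only of 7-bit characters, because the body has been converted to base64.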

The point of MIME is simple: it provides a framework for sending canned multimedia information over the net. It describes the encoding and transmission of the body of a message in a manner that is extensible. New content-types can be registered centrally with the Internet Assigned Numbers Authority (IANA@isi.edu). Thus, MIME was a natural choice for the World Wide Web.


Content-types

Common content-types include "text" (contains textual information in a variety of character sets), "multipart" (indicating that several different types of data are combined in one message), "application" (indicating that the message contains binary data specific to some application), "message" (for encapsulating other mail messages), "image" (for transmitting graphics files), "audio" (for transmitting audio or voice data), and "video" (for transmitting video or moving image data).

Such formats are usually specified as type/subtype; for example: text/plain, image/gif, or text/richtext. (Richtext, defined in RFC 1341, looks suspiciously like HTML ...)
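As a rough illustration of the type/subtype notation, the sketch below splits a Content-Type value into its parts. The function name `split_content_type` is hypothetical, and the parsing is deliberately simplified: it handles optional parameters such as `charset`, but ignores corner cases like semicolons inside quoted strings.

```python
def split_content_type(value):
    """Split 'type/subtype; param=value' into (type, subtype, params)."""
    media, *raw_params = value.split(";")
    # Type and subtype names are case-insensitive, so normalize them.
    main, _, sub = media.strip().lower().partition("/")
    params = {}
    for item in raw_params:
        key, _, val = item.partition("=")
        if key.strip():
            params[key.strip().lower()] = val.strip().strip('"')
    return main, sub, params


print(split_content_type("image/gif"))
print(split_content_type('Text/Plain; charset="us-ascii"'))
```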

The purpose of the Content-type field is to distinguish between different types of information. Web browsers can generally handle only certain types of file themselves (notably HTML, text, and some image formats). However, if they are told the content-type of a file before it arrives, they can take appropriate action. This usually consists of checking a configuration mechanism that maps content-types to external programs that can deal with that kind of content; the browser then sends the incoming indigestible data to an application that can do something with it.

For example, a Microsoft Rich Text Format (RTF) file is probably meaningless to a web server and to a browser. However, if the web server is configured to signal that files ending in ".rtf" are of content-type application/rtf, and the web browser is configured to associate application/rtf with Microsoft Word, the file can be handed over to Microsoft Word for display.
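The two configuration steps in this example can be sketched as a pair of lookup tables. Everything here is hypothetical: the table contents, the function names `content_type_for` and `dispatch`, and the fallback to application/octet-stream for unknown extensions are assumptions made for illustration, not the configuration format of any particular server or browser.

```python
import os

# Hypothetical server-side table: file extension -> content-type.
EXTENSION_TO_TYPE = {
    ".html": "text/html",
    ".txt": "text/plain",
    ".gif": "image/gif",
    ".rtf": "application/rtf",
}

# Types this imaginary browser can display without outside help.
BUILT_IN_TYPES = {"text/html", "text/plain", "image/gif"}

# Hypothetical browser-side table: content-type -> external program.
HELPER_APPLICATIONS = {
    "application/rtf": "Microsoft Word",
}


def content_type_for(filename):
    """Server side: choose the content-type to announce for a file."""
    extension = os.path.splitext(filename)[1].lower()
    # Assumed fallback for unknown extensions: generic binary data.
    return EXTENSION_TO_TYPE.get(extension, "application/octet-stream")


def dispatch(content_type):
    """Browser side: decide what to do with incoming data."""
    if content_type in BUILT_IN_TYPES:
        return "display in browser"
    helper = HELPER_APPLICATIONS.get(content_type)
    if helper is not None:
        return f"hand over to {helper}"
    return "offer to save to disk"


announced = content_type_for("report.rtf")
print(announced, "->", dispatch(announced))
```

The important point is that the browser never inspects the file name: it acts only on the content-type the server announces.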

Note that it is possible to specify a Content-type as being "multipart". This means that several distinct objects are enclosed in the MIME-encapsulated data. (This is important for server-push documents, described later.)
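A multipart message can be sketched with Python's standard `email` package, which wraps the parts in a multipart/mixed container. This is a minimal illustration only: the caption text, the file name, and the image bytes are placeholders (the bytes are not a valid GIF), and it shows ordinary MIME encapsulation rather than the server-push mechanism itself.

```python
from email.message import EmailMessage

msg = EmailMessage()

# First object: a plain-text body.
msg.set_content("A caption for the attached picture.\n")

# Second object: placeholder bytes standing in for an image.
# Only the declared content-type matters for this sketch.
fake_gif = b"GIF89a" + bytes(10)
msg.add_attachment(fake_gif, maintype="image", subtype="gif",
                   filename="picture.gif")

# Adding the attachment converted the message into a multipart
# container holding both objects, each with its own Content-Type.
part_types = [part.get_content_type() for part in msg.iter_parts()]

print(msg.get_content_type())
print(part_types)
```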

