Chapter 3: Understanding HTTP

Forms, ISINDEX, and META tags


As we've seen, HTTP supports much more than hypertext browsing. Its facilities for accepting data let a server interact with client software in a variety of ways. We are now in a position to look at some of the more complex issues surrounding the web: forms and scripts that run on the server, searchable document sets, META tags that refer to document metainformation, and server push/client pull.
These aspects of the web are among the most difficult to deal with, but also among the most rewarding. If you can give your users access to forms, you can provide central access to useful facilities: you can sell things to them, for example, or let them use a centrally maintained software tool, or require them to give their name before you grant them access to information. If you can make your web searchable, you multiply its usefulness; navigating through hyperlinks forces your users to follow the railroad tracks of someone else's thoughts, but if they can search document contents they can drive wherever they want to go. And document metainformation -- information about the structure of documents -- can be used to do some really nifty things (as described later).
Web servers can do more than serve documents; they can execute other programs and return the results to a browser. These programs are frequently called CGI scripts, after the Common Gateway Interface (CGI) through which they communicate with the web server. In this discussion, programs executed via CGI are referred to as scripts, because many of them are written in interpreted languages such as Perl. (Programs written in interpreted languages are frequently called scripts, to distinguish them from compiled binary programs.)
NOTE Two other types of programmatic facility are available on the web: server-side includes, which run on the server, and applets, which run on the client. Both are described later. For the sake of simplicity, we're going to ignore them until we have examined CGI in some detail.

When an HTTP server receives a request that invokes a program, as opposed to a file, it runs the appropriate program, and sends its output back to the requester.
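The dispatch described above can be sketched in a few lines of Python. This is only an illustration, not how any particular server is implemented: the `PROGRAMS` table, the `/cgi-bin/hello` path, and both function names are assumptions made up for the example, and the file branch does no path checking at all.

```python
import subprocess
import sys


def run_program(argv):
    """Run a program and return whatever it wrote to standard output."""
    completed = subprocess.run(argv, capture_output=True, check=True)
    return completed.stdout


def handle_request(path, programs):
    """Return the response body for a requested path.

    `programs` is a hypothetical table mapping URL paths to command lines.
    Any path not in the table is treated as a request for an ordinary file.
    """
    if path in programs:
        return run_program(programs[path])
    # Simplified: a real server must validate the path before opening it.
    with open(path.lstrip("/"), "rb") as f:
        return f.read()


# Hypothetical mapping: /cgi-bin/hello runs a one-line Python program.
PROGRAMS = {
    "/cgi-bin/hello": [sys.executable, "-c", "print('Hello from a program')"],
}

body = handle_request("/cgi-bin/hello", PROGRAMS)
```

The point of the sketch is the branch in `handle_request`: a file request returns stored bytes, while a program request returns whatever the program prints when it is run.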
To allow useful interaction between a user and a script on a server, users need some way to enter information that can then be transmitted to the server. An ordinary HTML document is fixed text; a dynamic, changeable medium is needed. In the HTML 2.0 standard, this is provided by forms.
Using HTML forms and CGI scripts, it is possible to interface an HTTP server to a large database so that the server acts as a front end to it, giving users many of the benefits of a distributed database without the corresponding disadvantages.
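To make "information that can then be transmitted" concrete, here is a small sketch of the URL encoding that browsers apply to form fields by default. The field names and values are hypothetical; the two functions come from Python's standard `urllib.parse` module.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical field names and values, as a user might type them into a form.
fields = {"name": "Ada Lovelace", "query": "tea & biscuits"}

# The browser side: pack the fields into a single URL-encoded string.
# Spaces become "+" and reserved characters such as "&" are %-escaped.
encoded = urlencode(fields)
# encoded == "name=Ada+Lovelace&query=tea+%26+biscuits"

# The receiving side: unpack the string back into named values.
# Each name maps to a list, because a form may repeat a field name.
decoded = parse_qs(encoded)
```

The escaping matters because "&" and "=" separate the fields; a literal "&" in a value has to be encoded so the receiver can tell the two apart.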
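As a rough illustration of the front-end idea, the sketch below looks up a value (as it might arrive from a form field) in a database and renders the matches as HTML. Everything specific here is assumed for the example: the in-memory SQLite database, the `books` table, its rows, and the function names.

```python
import sqlite3
from html import escape


def make_catalogue():
    """Build a small in-memory database (hypothetical table and rows)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE books (title TEXT, author TEXT)")
    db.executemany(
        "INSERT INTO books VALUES (?, ?)",
        [("Dune", "Frank Herbert"), ("Emma", "Jane Austen")],
    )
    return db


def search_page(db, author):
    """Return an HTML page listing the titles written by `author`."""
    # The "?" placeholder keeps user-supplied text out of the SQL itself.
    rows = db.execute(
        "SELECT title FROM books WHERE author = ? ORDER BY title", (author,)
    ).fetchall()
    # Escape each title so stored text cannot inject markup into the page.
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


page = search_page(make_catalogue(), "Jane Austen")
```

The user never talks to the database directly; the script sits between the form and the data, which is what lets the server act as a front end.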

Web browsers and CGI scripts

The relationship between a form and a CGI script is mediated through HTTP, using the web server as matchmaker. A form is an HTML document; it contains a URL that points to a CGI script (rather than to another HTML page). When you submit the form (by clicking a button on it), your browser sends a request to that URL, enclosing a package of data from the form. The server in turn runs the CGI script and feeds the data to it. The CGI script does something with the data and prints some HTML (or output of another MIME content-type) to its standard output; the web server reads this and returns it to your browser.
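The script's side of this exchange might look something like the following minimal sketch. It assumes the usual CGI conventions (GET data in the `QUERY_STRING` environment variable, POST data on standard input); the `name` field and the function names are hypothetical.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Collect the form fields the server passed to the script.

    Under CGI, GET data arrives in the QUERY_STRING environment variable;
    POST data arrives on standard input, CONTENT_LENGTH bytes long.
    (URL-encoded data is ASCII, so reading characters here is equivalent.)
    """
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


def build_response(fields):
    """Return the header and HTML body that the script prints."""
    # "name" is a hypothetical form field used only for illustration.
    name = fields.get("name", ["stranger"])[0]
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    # A blank line separates the header from the body.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    sys.stdout.write(build_response(read_form_data(os.environ, sys.stdin)))
```

Note that the script never opens a network connection: it reads its input from the environment or standard input and writes to standard output, leaving all of the HTTP traffic to the web server.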

[Figure: Flow of data in an HTTP request]

HTML 3.0, whose standard has not yet been set at the time of writing, is expected to have even more features. We'll look at some of them in a subsequent chapter.
