December 2000 Column




Why bother with a desktop, anyway?

Every once in a while it's a good idea to take a step back from the coal face, look around, and try to decide whether or not you're mining in the right direction. Which leads me, this once, to question the wisdom of some of the Linux distributors who, in their zeal to provide a fully integrated graphical desktop for new users, are losing sight of something important.

These days, almost all of us use computers that have a WIMP -- window, icon, mouse, pointer -- interface. These so-called GUIs (graphical user interfaces) are supposed to make it easier to work with a computer by providing a visual representation of the objects and processes running on it. The GUI was first implemented on a production computer by Xerox, who fumbled the ball, allowing Steve Jobs of Apple to revolutionise the industry with the Apple Lisa (and its successor, the Macintosh).

When the Lisa was launched, personal computers had acquired keyboards and monitors (replacing the teletype machines and punched card readers of yore), but the dominant mode of user interaction was the command prompt; you typed a command and the computer obeyed. All of which worked fine, as long as you knew enough about the computer to know what command to type, and what it would make the machine do.

The GUI revolution made the computer transparent -- you grabbed hold of a file and dropped it onto an icon representing your word processor, for example, and instead of having to memorize arcane keystroke combinations to edit text, you could choose them from a pull-down menu. The effect wasn't only skin-deep; it changed the semantics of user interaction at a very deep level. In effect, instead of remembering a verb and typing it, then typing a noun to apply it to, you could pick a noun from a list in a window and then choose a verb to apply to it from a pull-down menu.

But was this necessarily a good thing?

In one respect, yes: a graphical, menu-driven interface exposes the actions you can make it perform, so that you can sit down in front of an unfamiliar application and (if you have a vague idea of what it's supposed to do) hopefully make it do something useful without first having to memorize a lexicon of commands. This enables a new user to start using a program much faster, lowering the barrier to acceptance of computers in working culture.

On the other hand, there are drawbacks. The externalised interface provides an illusion of simplicity which evaporates when it is applied to a highly complex or configurable program: I defy anyone who hasn't used an image editor before to sit down in front of Photoshop and, purely by looking at the menus and dialog boxes, figure out how to do something useful! The situation only gets worse when you look at an application with a badly designed GUI -- one that doesn't give the user a good metaphor for the underlying data object they're manipulating, or that is internally inconsistent, or that makes irrelevant frills easy to use while concealing important functions under several layers of sub-menus. It's worth taking a salutary look at the user interface hall of shame -- pretty icons and window decorations don't make for a usable user interface.

And it gets worse. GUIs are poor tools for certain types of task. For example, think about moving files from one directory to another. In most file managers (such as the MacOS Finder, Windows Explorer, GNOME's Midnight Commander, or KDE's Konqueror) it's easy to select all the files in one directory and drag them into another. It's also easy to select a single file and drag it. However, if you have a directory containing several hundred files and want to move only the files ending in '.html' and beginning with the letter 'M' or a digit in the range 5-9, you're going to have trouble. You can tell your source window to list files by name in alphabetical order. Then you can manually select the files you want to move, and drag them. But it's far simpler just to type:

  mv [M5-9]*.html ../destination_directory
The GUI metaphor doesn't lend itself to this sort of declarative interaction -- where you selectively define the data you're operating on and apply a command to it. Nor does the GUI metaphor scale well. For a directory containing a few tens or hundreds of files, you can do the drag and drop thing; but what if you're dealing with a huge directory (thousands of files in size)? GUIs tend to require a manual interaction between the user and the target of their attentions. By replacing the abstract command-driven interface with one that makes the contents of the computer concrete, the GUI succeeds in chaining the expert user down and forcing them to attempt to accomplish manually tasks which are better carried out automatically.

A much more devastating example of what GUIs do wrong: imagine you have a directory containing hundreds of files, and that you want to take all the files ending in '.html' and beginning with the letter 'M' or the digits 5 to 9, and rename them so that they end in '.htm' instead. To rename a single file is easy enough; click on its name and type. To rename hundreds that way is a nightmare. But using the Linux command line (more accurately, a Bash shell prompt) you can type the following:

  for thisfile in [M5-9]*.html
  do
    mv "$thisfile" "$(basename "$thisfile" .html).htm"
  done
The Bash shell actually provides a programming language based on string substitution which lets us take the output of one program and use it as the input to another. Most of the time we just type single commands at a shell prompt, but we can feed tiny programs to it, like this one, which loops over every filename that matches the pattern [M5-9]*.html, and mv's (renames) the file to the output of the command "basename $thisfile .html" (which is the name of the file with the .html suffix stripped off it) with the new suffix ".htm" appended. (The double quotes around $thisfile simply keep any filenames containing spaces in one piece.)

You just can't do this with a graphical file manager -- unless one of the designers has provided you with a "mass renaming" tool that lets you specify a pattern of filenames to match and some sort of transformation to apply to them. In which case, they've broken the consistent user interface of the file manager by adding a 'hidden' power-user command, the meaning of which isn't immediately obvious by looking at the visual representation of the directory.

Now, this is not an argument against graphical desktop file managers. These tools are very useful, in their place. They make it easy for novice users to carry out basic operations on limited numbers of objects. More importantly, they make it possible for slightly more experienced users to quickly pick up a new piece of software and, if it isn't too complicated, figure out how to use it. Consider the KDE desktop and its tools: they follow a common user interface specification and provide broadly similar commands via a broadly similar menu-based interface, which makes it easy for a user to switch from one program to another. But, in racing headlong to embrace a graphical desktop, Linux distributors are doing their customers a disservice on another level.

Users of Windows and MacOS tend to look down on command line environments. The Mac doesn't have one in the first place, and Windows has the abomination that is MS-DOS hidden from view but occasionally seen to create a public embarrassment, like a senile old relative locked in an attic. The MS-DOS COMMAND.COM interpreter is pathetically underpowered; it added minimal batch execution features to the 1970s-vintage CP/M loader, and has effectively been a design orphan since 1990. Nobody wants to remedy its numerous defects, and indeed Microsoft has exiled it from Windows Millennium Edition in the hope that nobody will notice its absence.

Among the many sins of COMMAND.COM were its lack of string processing capabilities, its inability to provide true I/O redirection or job control, and the fact that MS-DOS's parameter passing model was fundamentally broken. (Oh, and we'll pass over the paucity of its batch file language and the lack of editing features.) The point to bear in mind is that the DOS/Windows command line is to a modern UNIX shell environment as a clapped-out East German Trabant is to a BMW.

For starters, the old DOS shell had very minimal pattern-matching facilities for identifying the files to which a command was to be applied. Under DOS you can use a question mark '?' to specify a single character -- any character. You can use an asterisk '*' to specify one or more characters bounded by the start or end of a filename, or by a period '.'. The UNIX and Linux shells are all a bit more sophisticated. In addition to the question mark, the asterisk has a subtly different meaning -- it matches any run of characters (though it won't match a leading period, so 'hidden' dotfiles are skipped). And the shells can cope with 'sets' -- the set [A-Mn-z] means "match a single character in the range A-M or n-z", and [^A-Mn-z] means "match a single character that is not in this set".

More importantly, when you hit the return key at the end of a shell command, any patterns in the line that are eligible to match files are 'expanded' -- replaced with the list of files that they match -- and the line is then executed. If you want to prevent a pattern being expanded, you can put it in single quotes -- the quotes are stripped off, but the contents are treated as a single lexical symbol by the shell interpreter. So if you have a file called "my file", you can apply a Linux command to it even though the space would normally lead the shell to assume you're talking about two items called "my" and "file".
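
For instance (a quick sketch -- these particular filenames are invented for illustration):

  ls [M5-9]*.html       # list files beginning with M or a digit 5-9 and ending in .html
  ls [^A-M]*            # list files whose names don't begin with a capital A to M
  wc -w 'my file'       # the quotes keep "my file" together as a single argument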

For seconds, there's a huge range of command-line tools available on an ordinary Linux (or UNIX) system. With a handful of notable exceptions (mostly archivers like tar, cpio, and dd, or interactive editors like vi or fdisk) these tools have a fairly uniform syntax. You can set zero or more options using command-line flags; then they expect a list of input files to process, or read from their standard input. Errors are emitted on the standard error filehandle, while valid output comes out of the standard output filehandle. You can daisy-chain these programs by piping their standard output to the standard input of another program, building arbitrarily long pipelines:

  ls -l | sort | more
or you can divert the standard error (in this case, to the bit bucket):
  grep 'search string' * 2> /dev/null
or merge standard output and standard error into a single stream:
  perl myscript.pl 2>&1 | less
(less and more are both pagers -- you use them to scroll through long lists of output, viewing them on a terminal.)

You can also tell a shell to start a job and return immediately, leaving the program running in the background:

   perl some_long_database_run.pl &
(The ampersand '&' indicates that the shell is to return immediately rather than waiting for the preceding command to complete).

If you've got multiple jobs running in the background, the Bash shell (and others such as the Korn shell ksh, the Z shell zsh, and the enhanced C shell tcsh) gives you job control -- the ability to suspend, kill, re-prioritize, re-start, and move background jobs back to the terminal. If you missed off the '&' at the end of the command above, you can suspend the program by typing a control-Z character, then type 'bg %1' to send job %1 (the suspended one) to the background. If the job stops because it wants some user input, the command 'fg %1' brings it to the foreground again.
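
A typical sequence might look something like this (a sketch, reusing the long-running script from the example above):

   perl some_long_database_run.pl    # oops -- forgot the trailing '&'
   # ...press control-Z here to suspend the running job...
   bg %1         # let job %1 carry on in the background
   jobs          # list the jobs this shell is managing
   fg %1         # bring job %1 back to the foreground when it wants input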

You can capture the output from a program by using backquotes (like `this`), or the $(command) notation -- the command in parentheses is replaced by its output, which is interpolated into some other command. For example:

  mv fred.txt $(basename fred.txt .txt).html
is expanded to
  mv fred.txt fred.html
because the output of the command 'basename fred.txt .txt' is the string 'fred'.
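The older backquote notation does exactly the same job; the example could equally have been written as:
  mv fred.txt `basename fred.txt .txt`.html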

There are other shell facilities built in. You can define aliases for frequently used command lines. A common and useful one on Linux is:

  alias ls='ls --color'
Every time you type the command ls, it will be expanded to ls --color; on a properly-equipped terminal it will then show you different filetypes distinguished by colour.
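
A couple of other popular aliases work the same way:

  alias ll='ls -l --color'    # 'll' gives a long, colourised listing
  alias rm='rm -i'            # make rm ask for confirmation before deleting each file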

The ls command basically returns a list of the files specified on its command line, along with their properties if you ask for them -- when you type 'ls *' the asterisk expands to a list of all files in the current directory. (If you specify no files, ls lists the contents of the current directory.) But there are other ways of selecting files. In particular, the find command is important. find is one of the odd UNIX commands that have a deviant syntax. It takes one parameter -- a directory to start searching from -- and then descends the directory tree below that starting point. What it does depends on any additional arguments you specify. If you use the '-print' argument, for instance, it prints the name of everything it finds that has satisfied all the previous arguments; the '-type f' argument specifies that only regular files will be selected, while '-type d' specifies only directories. For example:

   find /etc -type d -print
will print a list of all directories below /etc, while:
  du -s $(find /etc -type d -print)
will return the number of disk blocks (of 1KB) allocated to each directory named by the find command. (A good way of finding directories that contain unexpected piles of rubbish, this.)
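
To home in on the worst offenders, you can push that output through sort so that the biggest directories come out last:

  du -s $(find /etc -type d -print) | sort -n | tail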

You can add further filters to this sort of expression:

  du -s $(find / -type d -print | grep 'tmp')
or
  du -s $(find / -type d -name '*tmp*' -print)
Either version prints the size of each directory with 'tmp' in its name. Or you can do it another way:
  find / -type d -name '*tmp*' -exec du -s {} \;
The -exec flag executes the following command (up to the \; marker) on any file that satisfies the previous arguments to find -- the name of the file is interpolated in place of the {} marker.

(I prefer to use the $() notation as this only starts up one copy of the du (disk usage) program -- if we use the -exec form above, we fire up a separate copy of du for every directory we find that matches our search.)
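
(A third route to the same economy -- assuming the xargs utility from GNU findutils is installed, as it is on any normal Linux system -- is to let xargs gather the directory names into as few du invocations as it can manage:)

  find / -type d -name '*tmp*' -print | xargs du -s
  # (directory names containing spaces would need 'find ... -print0 | xargs -0 du -s' instead)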

The point of these examples is to illustrate the fact that the Linux command line environment is actually a lot more powerful than you'd expect if the only command line you've met before is DOS. The command environment and tools actually let you do tasks that are difficult, if not impossible, using a graphical user interface. For example:

   cat $(find /home/httpd/html -type f -name '*.htm*' -print) | dehtml | wc -w
returns an aggregate word count for all the words found in the plain text of the HTML files below the directory /home/httpd/html (presumably all the text stored on a web server).

On the Linux command line, this is a one-line command. In a GUI, this task is next to impossible -- unless you have a drag-and-drop word counter widget, or feel up to writing some macros in VBA or AppleScript or whatever task automation language your GUI has provided for accomplishing those tasks that simply break the transparent, graphical model of interacting with a computer.

The UNIX philosophy has always been "lots of little commands, with a uniform syntax, each of which does one task and does it well". These commands were designed to be strung together as data sources and filters, using the loose glue of a command processor shell that actually provides a pattern-substitution language. For really complex tasks it is often better to write a program in a full-blown programming language with operating system specific extensions, such as Perl, Python, or C; but the shells let you accomplish most of the work you need to do with a minimum of fuss. They also have a huge advantage over any GUI in terms of space, memory, and speed; you can cram a full command-line tool environment and Linux kernel onto a couple of floppy disks, and run it on a 386 with 4MB of memory, but a modern GUI may take ten to a hundred times the disk space, memory, and CPU power in order to give acceptable performance.

The quid pro quo for using these power tools is that you need to make a minimal effort to acquaint yourself with the commonest commands, and you need to understand how the UNIX directory structure is laid out, how we refer to files by name, and a little bit about how shells interpret the commands we type at them. A good place to start with this is any old UNIX tutorial -- emphasis on 'old': a book ten years old or more is a good bet to cover command-line tools rather than icons and mice. Alternatively, O'Reilly and Associates publish "Linux in a Nutshell" (edited by Jessica Perry Hekman; ISBN 1-56592-167-4) -- this provides a reference guide to the fundamental user commands and shells in just one slim volume, and unless you're an old hand you need to keep a copy on your desktop -- the wood or plastic one, not the one on your monitor.

Understanding how to use computers is out of fashion this decade, in the same way that understanding how to fix a puncture or unblock a drain is out of fashion -- people seem to expect these tasks to be done for them. Nevertheless, there's no substitute for a bit of understanding. In many cases, the availability of a graphical user interface hypnotises people into thinking that there's no other way of working; I've heard of people who were expected, as part of their job, to spend days renaming collections of files one at a time with a mouse, simply because it didn't occur to anyone at those companies that there was another, better way of doing things.

