Chapter 5: Introducing Perl

Perl on different Platforms




Perl is available for most common operating systems. However, if you intend to use a Perl script on more than one platform there are certain things to bear in mind.

UNIX

Perl is written in C, and is usually provided in compiled form for platforms that don't normally come with a C compiler. Most flavours of UNIX do come with a C compiler; furthermore, as UNIX systems are not normally binary compatible with one another, you may have to build your own Perl binary.

Luckily, Perl comes with a complex shell script called Configure. When you run it, it checks your system for various options which may (or may not) be present, asks you some technical questions, then writes a Makefile. You then run the UNIX utility make, which interprets the Makefile and drives the C compiler. Assuming Configure did its job properly, make will build a version of Perl customized to the facilities available on your system. Those services that your operating system does not support are simply not compiled into your copy of Perl.
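
One way to see the effect of this is to ask a running Perl what Configure recorded when it was built. The following is a minimal sketch, assuming Perl 5 and its standard Config module, which exposes the Configure results as the %Config hash. The helper name has_feature is hypothetical, and the three d_* keys shown are only a sample of what Configure records.

```perl
#!/usr/bin/perl
# Sketch: report a few of the optional services this perl was built with.
# Assumes Perl 5; has_feature is an illustrative name, not a standard routine.
use strict;
use Config;

# Return 1 if Configure recorded the named d_* entry as available, else 0.
sub has_feature {
    my ($key) = @_;
    return (defined $Config{$key} && $Config{$key} eq 'define') ? 1 : 0;
}

print "Built for: $Config{osname} ($Config{archname})\n";
foreach my $key ('d_fork', 'd_socket', 'd_symlink') {
    printf "%-10s %s\n", $key,
        has_feature($key) ? "available" : "not available";
}
```

A script that must run on several platforms can test such a value before relying on fork() or sockets, and fall back to something simpler if the service is missing.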

A test suite is included with the standard Perl distribution. Running the test suite executes a series of Perl scripts which exercise the interpreter and check that the output from it conforms to what is expected of the language.

To make matters a bit easier, if you have a common UNIX system (say, SunOS 4.1, Solaris 2.3, or SCO UNIX) you can probably obtain a precompiled copy of Perl. Indeed, although Perl is not part of UNIX (and is not listed for inclusion in the definitive SPEC1170 standard), many vendors are now shipping Perl with their systems as unsupported software.

Perl was originally designed for UNIX. If you have a good UNIX implementation, you can use all Perl's features. Perl knows no obvious memory limits, can fork() and exec() child programs, can connect to servers across a TCP/IP network, and can take advantage of most of the system's facilities. One of the advantages of Perl on UNIX is that as long as the Perl binary is reasonably well put together, it removes all the usual issues of memory management from the programmer. Writing C source code on a UNIX system condemns the programmer to a life of drudgery, hunting for obscure memory leaks.
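
As a rough illustration of the fork() and exec() facilities mentioned above, the sketch below starts a child process and collects its exit status. It assumes Perl 5 on a system that supports fork(). The subroutine name run_child is hypothetical, and the child command (the Perl interpreter itself, found via $^X) is chosen only so that the example is self-contained.

```perl
#!/usr/bin/perl
# Sketch: run a command in a child process and report its exit status.
# run_child is an illustrative name, not part of Perl.
use strict;

sub run_child {
    my (@cmd) = @_;
    my $pid = fork();                  # returns undef on failure
    die "fork failed: $!\n" unless defined $pid;
    if ($pid == 0) {                   # we are the child
        exec(@cmd);                    # only returns if exec itself failed
        exit(127);
    }
    waitpid($pid, 0);                  # parent: wait for the child to finish
    return $? >> 8;                    # high byte of $? is the exit status
}

my $status = run_child($^X, '-e', 'exit 3');
print "child exited with status $status\n";
```

On platforms without a real fork(), such as the DOS port described below, this pattern is not available and a script has to fall back on system() or something similar.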

MS-DOS

Perl suffers under MS-DOS for a simple reason: DOS cannot provide many of the features Perl requires. There is no standard TCP/IP stack for DOS, DOS doesn't understand multi-tasking, and DOS has very primitive memory management. For this reason, a pure DOS version of Perl is a sorry beast.

Luckily, the PC architecture has outgrown its humble origins. Windows takes over memory management from DOS and allows properly written Windows applications to access all the memory on a PC. At the same time, memory-management utilities called DOS extenders give ordinary DOS applications access to blocks of memory above 1 Mbyte. BigPerl for DOS can execute under Windows 3.1's DPMI-compatible memory manager. It still doesn't understand TCP/IP, and can only execute one child process at a time, but it gives you a full 16 Mbyte of virtual memory to work in.

Note also that Perl on DOS suffers from problems associated with filename semantics. The DOS filename separator is a backslash, "\", but Perl views a backslash as an escape character. You must therefore use an escaped backslash, "\\", to separate the elements of a pathname.
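
The difference is easiest to see side by side. The sketch below assumes Perl 5 syntax; the pathnames and the helper name dos_path are purely illustrative. It shows the escaped form in a double-quoted string, the same path in a single-quoted string (where a backslash before an ordinary character is taken literally), and what goes wrong when the escape is forgotten.

```perl
#!/usr/bin/perl
# Sketch: writing DOS pathnames in Perl strings.
# The paths and the dos_path helper are hypothetical examples.
use strict;

my $double = "C:\\PERL\\BIN\\PERL.EXE";  # each \\ yields one backslash
my $single = 'C:\PERL\BIN\PERL.EXE';     # single quotes: no escaping needed here
print "both forms give the same path\n" if $double eq $single;

my $wrong = "C:\temp";                   # \t is read as a TAB character
print "unescaped form is ", length($wrong), " characters, not 7\n";

# Join pathname elements with a backslash, so that the escaping
# lives in exactly one place.
sub dos_path {
    my (@parts) = @_;
    return join("\\", @parts);
}

print dos_path('C:', 'PERL', 'BIN'), "\n";
```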

BigPerl is currently only available for Perl 4.036. A Perl 5.0 port is under development.

Macintosh

The Macintosh is an alien beast. However, MacPerl is much friendlier than you might expect. MacPerl is currently available as an integrated environment (a text editor which can execute Perl programs) and a standalone interpreter (that runs under Apple's MPW command-line shell). MacPerl is bound by memory limitations, but these are somewhat less extreme than those for DOS Perl; you can allocate memory up to the maximum amount of virtual memory available on your Macintosh and Perl will use it. MacPerl also has a rather interesting socket-handling package. Unlike standard Perl, MacPerl uses a custom set of socket interfaces called GUSI -- the Grand Unified Socket Interface. GUSI provides a Berkeley sockets interface and supports TCP/IP and the Macintosh networking protocols. Using GUSI you can do most of the TCP/IP stuff that a UNIX Perl program can do.

The eccentricities of MacPerl should be obvious if you think for a moment about the eccentricities of the Macintosh environment. Firstly, the Macintosh filename semantics are sufficiently different from those of DOS or UNIX to make it necessary to modify scripts that open files. Secondly, MacPerl can't execute external programs in the same way as UNIX Perl -- although it has an interface to the AppleScript scripting mechanism, so that it can drive scriptable Macintosh applications (and communicate with them), and can interoperate with the MPW ToolServer. Thirdly, there is a package of external commands for MacPerl which you should use with any Macintosh Perl scripts. This lets you prompt for input, display simple dialogues, and load external compiled code fragments (XCMDs and XFCNs). It's essential, because Macintosh programs don't have a run-time environment to pass variables in (as UNIX and DOS programs do).

MacPerl can interoperate tightly with the WebStar web server -- thus allowing CGI scripts in Perl to be ported between Macintosh and UNIX fairly easily. A package to allow emulation of standard CGI scripts is available.

Currently, MacPerl is stable at the level of Perl 4.036 for UNIX. However, work on MacPerl 5.0 is in progress. A version of MacPerl that is usable as an Apple OSAX (native scripting language) may become available in the near future, and support for the Macintosh port of the Tk widget toolkit (currently being carried out by Sun Microsystems) has been suggested.

This small library interconverts Macintosh and UNIX pathnames. If you need to write a Perl script that can run in either environment, it may prove indispensable. Note that these routines rely on regular expressions to do the conversion, so they may look a little opaque at first. A "better" way would be to build a pushdown stack of pathname elements, remove the parents of ".." elements, and then pop all the elements off the stack -- but that's a complex solution to code, and a little slower to execute.

Macintosh pathnames consist of a series of folder names separated by colons, followed by a filename. A leading colon means that the path is "relative" to the current folder. Two colons in a row mean "go up a level"; three in a row mean "go up two levels", and so on.

UNIX pathnames consist of a series of directory names separated by slashes, followed by a filename. A leading slash means that the path is "absolute", from the top of the filesystem. Two dots in a row mean "go up a level"; three in a row is a mistake.


#!/usr/local/bin/perl


sub mac2ux {
    #
    # takes a macintosh pathname and reformats it as a UNIX pathname
    #
    local ($inpath) = @_[$[];      # first argument to &mac2ux
    $inpath = &eat_parents($inpath);    
                                   # convert "::" refs 
    $inpath =~ s#:#/#g;            # change all ":" to "/" in $inpath

    while ($inpath =~ m#//#) {     # while $inpath matches "//" 
        $inpath =~ s#//#/../#g;    # replace "//" with "/../"
    }

    if (substr($inpath,$[,1) eq "/") { # if the first character of $inpath
                                       # is already a "/", remove it
        $inpath = substr($inpath, $[+1);
    } else {                           # otherwise, insert a leading "/"
        $inpath = "/" . $inpath;
    }

    return $inpath;                    # return the value of $inpath
}

sub ux2mac {
    #
    # takes a UNIX pathname and reformats it as a Macintosh pathname
    #
    local ($inpath) = @_[$[];
    $inpath =~ s#/#:#g;                          # change all "/" to ":"

    if (substr($inpath,$[,1) eq ":") { # if $inpath begins with a ":"
                                       # remove the leading colon
        $inpath = substr($inpath, $[+1);
    } else {                           # otherwise, insert a leading colon
        $inpath = ":" . $inpath;
    }
    # the next substitution is particularly complex. It means:
    # look for matches of two periods and a colon (unless they are preceded
    # by another colon), and replace them with ":..:"
    #
    $inpath =~ s#[^:]\.\.:#:..:#;    
    while ($inpath =~ m#:\.\.:#) {              # while the string contains ":..:"
        $inpath =~ s#:\.\.:#::#g;               # replace it with "::"
    }

    $inpath = &eat_parents($inpath);        # now remove "::" pairs and their
                                                # parent directories 
    return $inpath;                             # return the value of $inpath
}

sub eat_parents {
    #
    # this subroutine looks for patterns like (directory1:)(directory2):: 
    # and replaces them with (directory1:)
    #
    local ($line) = @_[$[];
    #
    # let's eat our parent directories wherever we see "::" 
    #
    WHILE:                          # a loop identifier
    while ($line =~ m#::#) {        # while there are "::" constructs ...
        last WHILE if ($line =~ /^:(:)+[^:]+/);
                                    # quit the loop if there are no
                                    # directories to discard left in $line
                                    #  --  something's probably gone wrong
        if ($line  =~  /(.*:)([^:]+::)(.*)/) {
                                    # if the line contains the target
                                    # pattern
            ($prefix,$target,$suffix)
               = ($line =~ /(.*:)([^:]+::)(.*)/);
                                    # apply the pattern to the line and
                                    # split the chunks into prefix, bit
                                    # to get rid of, and suffix
            $line = $prefix . $suffix;
        } else {
            if ($line  =~ /[^:]*::[^:].*/) {
                                    # if the line consists of 
                                    # (directory::)filename
                ($target,$suffix)
                 = ($line  =~ /([^:]*:)(.*)/);
                                    # get the filename, and return that
                $line  =  $suffix;
                                    # then exit the loop because there
                                    # are obviously no more directories to
                                    # discard
                last WHILE;
            }
        }
    }
    return $line;
}


# test script: running the following script (on UNIX) tests
# the routines above. Type the script's name to start it, then type unix
# pathnames followed by a carriage return. It will convert them to mac, then
# back to UNIX, and tell you if they match. 
# Note that it can't handle multiple "../" operators on one line.

#!/usr/local/bin/perl
while (<>) {
    chop;                                            # chop() dumps last character in a line
    $inp = $_;
    $res = &ux2mac($inp);                     # unix filename -> mac
    print "ux2mac($inp) ==> $res\n";    # print results
    $cmp = &mac2ux($res);                        # turn mac filename back into unix
    print "mac2ux($res) ==> $cmp\n";    # print results
    if ($cmp eq $inp) {                 # say if the back-conversion got back
                                        # to the original pathname
        print "functions are orthogonal\n";
    } else {
        print "Warning: input not identical to output!\n";
        if ($inp =~ /\.\./) {           # only offer the hint on a mismatch
            print "This may be because your input contained\n";
            print "the \"..\" operator.\n";
        }
    }
}
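
For comparison, the pushdown-stack approach mentioned in the introduction to the library might look something like the sketch below. It is not part of the library above. It assumes Perl 5 syntax (my and negative array subscripts), handles only the Macintosh-to-UNIX direction, and assumes that the pathname ends in a file or folder name rather than in a run of colons. The name mac2ux_stack is hypothetical.

```perl
#!/usr/bin/perl
# Sketch: stack-based Macintosh-to-UNIX pathname conversion.
# mac2ux_stack is an illustrative name; trailing colons are not handled.
use strict;

sub mac2ux_stack {
    my ($path) = @_;
    my $relative = ($path =~ s/^://) ? 1 : 0;  # leading colon: relative path
    my @stack;
    foreach my $elem (split(/:/, $path)) {
        if ($elem ne '') {
            push(@stack, $elem);               # ordinary folder or file name
        } elsif (@stack && $stack[-1] ne '..') {
            pop(@stack);                       # extra colon: discard the parent
        } elsif ($relative) {
            push(@stack, '..');                # nothing left to discard: go up
        }                                      # absolute path: stay at the top
    }
    my $out = join('/', @stack);
    return $relative ? ($out eq '' ? '.' : $out) : "/$out";
}

foreach my $p (':a:b::c', '::foo', 'Disk:dir:file') {
    print "$p ==> ", mac2ux_stack($p), "\n";
}
```

Each empty element produced by split() corresponds to one extra colon, so ":a:b::c" becomes "a/c" and "::foo" becomes "../foo". Because every element is examined exactly once, several "go up" markers in one pathname need no special treatment.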

