References




I follow comp.lang.perl.misc a lot of the time. Sometimes I even try to answer questions there. I'm not an expert perler of the first water -- I know just enough to bluff my living as a perl programmer -- but I like being able to help someone achieve enlightenment (on those rare occasions when I understand what's going on).

Here's my take on references, a useful but mind-mangling feature of Perl 5 that is fundamental to an understanding of complex data structures and objects in the latest version of this language.


Newsgroups: comp.lang.perl.misc
Subject: Re: how to read all of a 2D assoc array
References:  <NEWTNews.825095068.7085.pp002530@interramp.com>
Reply-To: charles@fma.com
Followup-To: 
In article <NEWTNews.825095068.7085.pp002530@interramp.com>, name_withheld@interramp.com wrote:
>
>The following code fragment causes Perl to complain that the (presumptive) 
>associative array slice is unacceptable (in the context?).  Would someone 
>answer this beginner's question: What's the ... sorry, what's a correct way to 
>work though all the elements of a properly-defined two-dimensional associative 
>array?
>
>foreach $pt (keys %alias) {                                                   
>  foreach $x (keys @alias{$pt}) {                                              
>        print "ALIAS FOR $pt = ", $alias{$pt}{$x}, "\n";                       
>  }                                                                            
> }
You're making a fundamental mistake here, in that you're assuming perl 5 supports two-dimensional associative arrays. (Don't worry, I made the same mistake myself until recently.) In a nutshell, perl 5 does NOT support two-dimensional associative arrays. What it DOES support are references, and you can use references to create structures equivalent to a two-dimensional (or n-dimensional) array.

How do you do this?

Well, the details are described in the perlref manpage, and you can pick up more stuff on Tom Christiansen's Perl Data Structures Cookbook (at http://www.perl.com/). However, I found those explanations a bit confusing at first. So here's my take on what's going on ... doubtless if I screw this up someone will let me know in no uncertain terms!

I'm going to assume that you're happy about perl's three basic data structures; the scalar variable ($foo), the array (@foo), and the associative array (%foo). If you aren't happy with these concepts, save this posting for later reference but don't bother reading it now because it won't make much sense to you.

When you define a variable, for example by assigning a value to it, like:

    $foo = "bar";
Perl creates an entry in its symbol table to hold this variable. The entry is called "foo" and its value is "bar". In effect, you can think of the symbol table as being a whacking great associative array that perl uses to hold all the variables in a running process. (It's a bit more complex than that, because it can also store arrays and associative arrays -- hereafter termed hashes -- but let's not worry about that just yet.)

You can actually dump the symbol table of a perl process and look at all the variables defined in it; under perl 4, you'd use the package dumpvar.pl (see the Camel book, p 394 (at least in my first printing copy)). This is more or less how the perl debugger gets access to a running program.
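In Perl 5 the "whacking great associative array" is more than a figure of speech: the symbol table of package main is visible to your program as the hash %main::. Here's a minimal sketch of peeking into it. The variable names (in particular $hidden) are made up for illustration; the point is that package variables show up in the table and my() variables don't.

```perl
use strict;
use warnings;

# Package variables live in the symbol table; lexical (my) variables do not.
our $foo    = "bar";
my  $hidden = "not in the symbol table";   # hypothetical name, for contrast

# %main:: is the symbol table of package main, keyed by bare variable name.
my $has_foo    = exists $main::{foo}    ? "yes" : "no";
my $has_hidden = exists $main::{hidden} ? "yes" : "no";

print "foo in symbol table:    $has_foo\n";
print "hidden in symbol table: $has_hidden\n";
```

This should report "yes" for foo and "no" for hidden.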

I've been talking about symbol tables in the singular, but actually a perl program can have more than one symbol table. Every time you use a package, you're defining variable names within the scope of a new symbol table -- one associated with that package. Each package has its own symbol table, which effectively amounts to a fresh name space -- if $foo is defined in the symbol table of the main program, but the main program uses package bar, it's possible that package bar has an entirely separate variable called $foo. You can refer to it from the main program as $bar::foo -- and you can examine the main symbol table's $foo from within package bar by referring to $main::foo.

(In perl 4 the syntax was different: $bar'foo or $main'foo, with an apostrophe as the package delimiter, but the principle was the same.)
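To make the separate name spaces concrete, here's a small sketch with two variables both called $foo, one in package bar and one in main. The values are arbitrary; only the package names come from the discussion above.

```perl
use strict;
use warnings;

{
    package bar;
    our $foo = "quux";      # this $foo lives in bar's symbol table
}

our $foo = "bar";           # this $foo lives in main's symbol table

# Fully qualified names pick out one symbol table or the other.
print "main: $main::foo, bar: $bar::foo\n";
```

The two variables never collide, because each name is looked up in its own package's table.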

Now, if a symbol table contains variables, with associated values, it's fairly obvious that we can get at a variable by referring to its name: $foo gives us the contents of variable foo. But why can't we do indirection? For example:

$foo = "bar";
$bar = "quux";
print $$foo;
should intuitively print "quux". (That is: "$$foo" is replaced by "$bar" which is replaced by the value of variable "bar".)

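Well, in Perl 5, we can do this (provided $bar is a package variable rather than a my() variable, and use strict 'refs' isn't in force).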

$$foo is an example of a symbolic reference; we refer to a variable by its name.
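Here's the same indirection as a runnable sketch. Two things in it are my additions rather than part of the example above: the variables are declared with our, because symbolic references can only find package variables, and the lookup is wrapped in a block that switches off strict refs, because use strict otherwise forbids it.

```perl
use strict;
use warnings;

# Symbolic references only see package variables, hence our rather than my.
our $foo = "bar";
our $bar = "quux";

my $value;
{
    no strict 'refs';   # symbolic references are an error under strict refs
    $value = $$foo;     # $foo holds the name "bar", so this reads $main::bar
}

print "$value\n";       # prints "quux"
```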

In addition to being able to refer to variables by name, we can refer to them by their actual location, bypassing the name lookup altogether.

To get the location of a variable, you do something like this:

$bar = "quux";
$foo = \$bar;
print $$foo;
This is equivalent to the earlier example, except that instead of storing the string "bar", $foo contains the address of $bar.

It's a bit like pointers in C; the \$bar bit is a bit like the & (address-of) operator. The main difference is that the C operator returns a raw address in memory that you can do arithmetic on, while the Perl operator returns a 'reference' -- an opaque handle to the variable's value that knows what type of thing it points to.
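A quick sketch of what you can do with such a reference. It uses my() variables to show that no name lookup is involved, and the built-in ref() to ask what kind of thing the reference points at; the replacement value "xyzzy" is arbitrary.

```perl
use strict;
use warnings;

my $bar = "quux";
my $foo = \$bar;        # hard reference to $bar

my $kind = ref($foo);   # "SCALAR": the reference knows what it points to
$$foo = "xyzzy";        # assigning through the reference changes $bar itself

print "$kind $bar\n";   # prints "SCALAR xyzzy"
```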

Now suppose we want to do something a bit more fancy: say, pass an associative array to a subroutine. But instead of using the * operator, or passing the array by value, we want to pass it by reference. We can do this:

%fribble = ( 
     "colour"  => "red",
     "flavour" => "chocolate",
     "day"     => "monday"
);

$result = &my_function(\%fribble);

   :
   :

sub my_function {
    my ($hashref) = @_;   # first argument: a reference to the caller's hash

    # %$hashref is the whole hash; $$hashref{$key} fetches a single value
    foreach my $key (keys %$hashref) {
        print "$key => $$hashref{$key}\n";
    }
    return 0;
}
When we call my_function with \%fribble, we pass a reference to the hash called fribble into my_function. Inside my_function, this reference is stored locally in $hashref. We can dereference it to get back to the hash by referring to it as %$hashref, and we can get individual values out by dereferencing it with a $ sigil and a key in braces; $$hashref{foo} is the same as $fribble{foo}.
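One practical consequence is worth a sketch: a subroutine that receives a reference is working on the caller's own hash, whereas one that receives the hash by value gets a copy. The two subroutine names below are hypothetical, invented just for the comparison.

```perl
use strict;
use warnings;

# Hypothetical helpers: one takes a copy of the hash, one takes a reference.
sub by_value     { my (%copy)    = @_; $copy{colour}     = "blue";  return; }
sub by_reference { my ($hashref) = @_; $$hashref{colour} = "green"; return; }

my %fribble = (colour => "red");

by_value(%fribble);
my $after_value = $fribble{colour};       # still "red": only the copy changed

by_reference(\%fribble);
my $after_reference = $fribble{colour};   # "green": the original was changed

print "$after_value, then $after_reference\n";
```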

The $$hashref{} thing is a bit inelegant. So Perl 5 gives us a shortcut:

 $hashref->{foo}
is the same as
$$hashref{foo}
This notation is fairly simple: $hashref is a reference to a hash, the "->" operator dereferences through it to get to the item it points to. This comes in handy when we consider (finally!) multidimensional hashes.
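If you want to convince yourself that the spellings really are interchangeable, here's a small sketch that fetches the same element three ways (the middle form, with explicit braces around the reference, is the fully spelled-out version of the first).

```perl
use strict;
use warnings;

my %fribble = (colour => "red", flavour => "chocolate", day => "monday");
my $hashref = \%fribble;

my $sigil_form = $$hashref{colour};     # terse form
my $block_form = ${$hashref}{colour};   # same thing with explicit braces
my $arrow_form = $hashref->{colour};    # arrow shortcut

print "$sigil_form $block_form $arrow_form\n";   # prints "red red red"
```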

What is a multidimensional associative array? Really?

Well, a one-dimensional associative array is a hash: each key maps to a unique value within the hash. A two-dimensional array, if such a brute existed, would be a hash that needs two keys to find a given value.

It doesn't exist: but Perl 5 lets us emulate one using references.

Let's suppose we want to build a simple scheduler. There's a range of days, and within each day, a range of times. An event at the same time can occur on different days; the two dimensions are effectively independent of each other. This is a prime candidate for a two-dimensional associative array.

To locate any given event, first we need to find a given day in a hash of days:

$today = $day_of_week{"Thursday"};
But we don't want to retrieve a simple scalar -- we want to retrieve an *entire associative array* of times occurring in that day.

The best way to do this is to store a reference to a hash.

$today shouldn't hold a plain string or number; it should hold a reference to a hash. So to find a given event at time $time on day $day, we need to do something like this:

$event = $day_of_week{$day}->{$time};
We've defined the associative array %day_of_week. Each value in it is a reference to another hash. Another way of writing this is:
$event = ${$day_of_week{$day}}{$time};
(The enclosing ${...}{} around our reference forces it to be interpreted as a reference to a hash. We could force it to be interpreted as a reference to an ordinary array by @{$day_of_week{$day}}.)

Given an event $event occurring at time $time on day $day, we can give it an entry in our scheduler by doing something like this:

$day_of_week{$day}->{$time} = $event;
It doesn't matter if the anonymous hash $day_of_week{$day} refers to doesn't have an entry for $time yet; this will create one.
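In fact Perl goes further than that: if there is no inner hash for the day at all, the assignment conjures one up on the spot. Here's a sketch starting from a completely empty %day_of_week; the time and event strings are invented. The last two lines show the flip side, which is that merely looking up a time on a day you never stored also creates an (empty) inner hash for that day.

```perl
use strict;
use warnings;

my %day_of_week;    # starts out completely empty

# No inner hash exists for Thursday yet; the assignment creates it.
$day_of_week{Thursday}->{"14:00"} = "dentist";

my $kind  = ref $day_of_week{Thursday};           # "HASH"
my $event = $day_of_week{Thursday}->{"14:00"};    # "dentist"

# Reading through a missing day creates an empty inner hash as a side effect.
my $nothing        = $day_of_week{Friday}->{"09:00"};
my $friday_created = exists $day_of_week{Friday} ? "yes" : "no";

print "$kind, $event, Friday created: $friday_created\n";
```

If that side effect matters, test with exists $day_of_week{$day} before looking inside.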

Actually, this example supposes that we've actually got a known hash called %day_of_week sitting around. We might as well just have a reference called $day_of_week, that points to a hash, each element of which points to another hash. Then we could write:

$day_of_week->{$day}->{$time} = $event;
How do we iterate over all the items in a hash of hashes?

Suppose we're working with %day_of_week. We can get all the keys in it in the usual manner:

foreach $key (keys %day_of_week) {
Alternatively, suppose we've got the reference $day_of_week. We can get the keys in the hash it references like this:
foreach $key (keys %$day_of_week) {
Then, for each $key, we dereference it in the context of an associative array:
foreach $day (keys %$day_of_week) {
    foreach $time (keys %{ $day_of_week->{$day} }) {
        # ... do something ...       
    }
}
Now we can look at the first stab at doing this and see what's going wrong:
foreach $pt (keys %alias) {                                                   
    foreach $x (keys @alias{$pt}) { 
        print "ALIAS FOR $pt = ", $alias{$pt}{$x}, "\n"; 
    }   
}
The first foreach loop is fine. It gets us all the keys on the hash %alias.

But the second foreach loop is wrong. @alias{$pt} is a hash slice of %alias: it returns a list holding the single value $alias{$pt}, and keys needs an actual hash to work on, not a list of values. Perhaps it was meant to be:

foreach $x (keys $alias{$pt})
But this, too, is wrong. $alias{$pt} evaluates to a scalar, namely a reference to a hash; keys wants the hash itself.

You can fix it by dereferencing that scalar as a hash:

foreach $x (keys %{ $alias{$pt} })
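This gets all the keys in the hash referenced by $alias{$pt}.

The print statement needed no fix: Perl lets you leave out the arrow between adjacent subscripts, so $alias{$pt}{$x} means the same as the explicit form: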

print "ALIAS FOR $pt = ", $alias{$pt}->{$x}, "\n";
So the general principle is:

You don't have multi-dimensional arrays. Instead, you have anonymous (nameless, dynamically created) arrays and hashes, which are addressed by their references (pointers). References can be stashed in scalars. So you use an associative array keyed off your first independent parameter to store references to anonymous associative arrays keyed off your second independent parameter, and that emulates a two-dimensional associative array.
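Putting the pieces together, here's a sketch of the original fragment with the inner loop repaired. The contents of %alias are invented for illustration (the original post doesn't show them), and I've added sort so the output comes out in a predictable order.

```perl
use strict;
use warnings;

# Invented sample data: each value is a reference to an anonymous inner hash.
my %alias = (
    alpha => { one => "a1", two => "a2" },
    beta  => { one => "b1" },
);

my @out;
foreach my $pt (sort keys %alias) {
    # Dereference the inner hash before asking for its keys.
    foreach my $x (sort keys %{ $alias{$pt} }) {
        push @out, "ALIAS FOR $pt = $alias{$pt}{$x}";
    }
}

print "$_\n" for @out;
```

With this data the loop produces three lines, two for alpha and one for beta.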

