Why are references compacted inside Perl lists?

Posted by parkan on Stack Overflow
Published on 2010-06-17T23:25:06Z

I'm putting the same precompiled regex inside two different hashes, both referenced from a list:

my @list = ();

my $regex = qr/ABC/;

push @list, { 'one' => $regex };
push @list, { 'two' => $regex };

use Data::Dumper;
print Dumper(\@list);

I'd expect:

$VAR1 = [
      {
        'one' => qr/(?-xism:ABC)/
      },
      {
        'two' => qr/(?-xism:ABC)/
      }
    ];

But instead we get a circular reference:

$VAR1 = [
      {
        'one' => qr/(?-xism:ABC)/
      },
      {
        'two' => $VAR1->[0]{'one'}
      }
    ];

This happens no matter how deeply the hash references are nested, and also when $regex is shallowly copied.

I'm assuming the basic reason is that precompiled regexes are actually references, and that references inside the same list structure are compacted as an optimization (\$scalar behaves the same way). I don't entirely see the utility of doing this (presumably a reference to a reference has the same memory footprint), but maybe there's a reason based on the internal representation.
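To check the assumption that a precompiled regex is a reference, here is a minimal sketch using Scalar::Util (a core module). It compares the address of the referent stored under each hash key; the variable names @other, $sref, $same_regex, and $same_scalar are only illustrative. If both addresses match, that would suggest Data::Dumper's $VAR1->[0]{'one'} is its notation for a reference that appears twice, rather than a true cycle in the data.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $regex = qr/ABC/;
my @list  = ({ one => $regex }, { two => $regex });

# qr// returns a reference blessed into the Regexp class.
print "ref(\$regex): ", ref($regex), "\n";    # prints "Regexp"

# Each hash holds its own copy of the reference, but both copies
# point at the same underlying regex object.
my $same_regex = refaddr($list[0]{one}) == refaddr($list[1]{two});
print "regex referent shared: ", ($same_regex ? "yes" : "no"), "\n";

# A plain scalar reference behaves the same way.
my $scalar = 42;
my $sref   = \$scalar;
my @other  = ({ one => $sref }, { two => $sref });
my $same_scalar = refaddr($other[0]{one}) == refaddr($other[1]{two});
print "scalar referent shared: ", ($same_scalar ? "yes" : "no"), "\n";
```

Nothing in @list points back at @list itself, so walking the structure terminates; the two hashes simply share one referent.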

Is this the correct behavior? Can I stop it from happening? Aside from probably making GC more difficult, these circular structures create pretty serious headaches. For example, iterating over a list of queries that may sometimes contain the same regular expression crashes the MongoDB driver with a nasty segfault (see https://rt.cpan.org/Public/Bug/Display.html?id=58500).
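For the dumped output at least, one thing that may help is Data::Dumper's documented Deepcopy option, which asks the dumper to cross-reference only when it has to break a real cycle. This is a sketch under that assumption; it only changes how the structure is printed, not the structure itself, and it says nothing about how the MongoDB driver walks the data.

```perl
use strict;
use warnings;
use Data::Dumper;

my $regex = qr/ABC/;
my @list  = ({ one => $regex }, { two => $regex });

# Deepcopy(1): repeat shared substructures in full instead of
# emitting $VAR1->... cross-references (cycles are still cross-referenced).
my $dumper = Data::Dumper->new([ \@list ]);
$dumper->Deepcopy(1);

my $out = $dumper->Dump;
print $out;
```

The same setting is available globally as $Data::Dumper::Deepcopy = 1. The exact rendering of the regex (for example qr/(?-xism:ABC)/ versus a newer form) depends on the Perl and Data::Dumper versions in use.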
