Why are references compacted inside Perl lists?
- by parkan
Putting a precompiled regex inside two different hashes referenced in a list:
my @list = ();
my $regex = qr/ABC/;
push @list, { 'one' => $regex };
push @list, { 'two' => $regex };
use Data::Dumper;
print Dumper(\@list);
I'd expect:
$VAR1 = [
{
'one' => qr/(?-xism:ABC)/
},
{
'two' => qr/(?-xism:ABC)/
}
];
But instead we get what looks like a circular reference:
$VAR1 = [
{
'one' => qr/(?-xism:ABC)/
},
{
'two' => $VAR1->[0]{'one'}
}
];
This happens with arbitrarily deeply nested hash references and with shallow copies of $regex.
I'm assuming the basic reason is that precompiled regexes are actually references, and references inside the same list structure are compacted as an optimization (\$scalar behaves the same way). I don't entirely see the utility of doing this (presumably a reference to a reference has the same memory footprint), but maybe there's a reason based on the internal representation.
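One way to test the "precompiled regexes are actually references" assumption is to compare the addresses that the two hash values point to. This is only a minimal sketch using the core Scalar::Util module; the variable $same is a name introduced here for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $regex = qr/ABC/;
my @list  = ({ one => $regex }, { two => $regex });

# A qr// value is a reference blessed into the Regexp class.
print ref($regex), "\n";

# Assigning $regex into each hash copies the reference, not the compiled
# pattern, so both hash values should report the same address.
my $same = refaddr($list[0]{one}) == refaddr($list[1]{two});
print $same ? "shared referent\n" : "distinct referents\n";
```

If the two addresses match, the hashes simply share one regex object, and nothing in the list points back at the list itself.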
Is this the correct behavior? Can I stop it from happening? Aside from probably making GC more difficult, these circular structures create pretty serious headaches. For example, iterating over a list of queries that sometimes contain the same regular expression crashes the MongoDB driver with a nasty segfault (see https://rt.cpan.org/Public/Bug/Display.html?id=58500).
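On the "can I stop it" part, at least as far as the dump output goes: Data::Dumper documents a $Data::Dumper::Deepcopy setting that limits cross-referencing to the cases needed to break real cycles. The sketch below assumes the $VAR1->[0]{'one'} line comes from Data::Dumper's own bookkeeping of repeated references. It changes only what gets printed, not how the data is stored, so it would not by itself address the driver crash. The variable $dump is a name introduced here for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

my $regex = qr/ABC/;
my @list  = ({ one => $regex }, { two => $regex });

# With Deepcopy enabled, a referent seen more than once is written out in
# full each time instead of as a $VAR1->... cross-reference.
local $Data::Dumper::Deepcopy = 1;

my $dump = Dumper(\@list);
print $dump;
```

With this setting both hashes should show the pattern itself. The exact qr// text depends on the Perl version, so the output may not match the (?-xism:...) form shown above.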