How can I build and parse HTTP URLs / URIs / paths in Perl?
Posted by Robert S. Barnes on Stack Overflow
Published on 2010-04-19T12:13:40Z
Indexed on 2010/04/20 13:13 UTC
I have a wget-like script which downloads a page and then retrieves all the files linked in IMG tags on that page.
Given the URL of the original page and the link extracted from an IMG tag in that page, I need to build the URL for the image file I want to retrieve. Currently I use a function I wrote:
    sub build_url {
        my ( $base, $path ) = @_;

        # if the path is absolute, just prepend the scheme and domain to it
        if ( $path =~ m{^/} ) {
            my ($origin) = $base =~ m{^((?:http://)?\w+(?:\.\w+)+)};
            return "$origin$path";
        }

        my @base = split '/', $base;
        my @path = split '/', $path;

        # remove a trailing filename
        pop @base if $base =~ m{[[:alnum:]]+/\w+\.\w+$};

        # check for relative paths: count the "../" segments
        # ("= () =" forces list context; a scalar-context match only
        # returns true or false, not the number of matches)
        my $relcount = () = $path =~ m{\.\./}g;
        while ( $relcount-- ) {
            pop @base;
            shift @path;
        }

        return join '/', @base, @path;
    }
The thing is, I'm surely not the first person to solve this problem. It is such a general problem that I assume there must be a better, more standard way of dealing with it, using either a core module or something from CPAN, although a core module is preferable. I was thinking about File::Spec but wasn't sure whether it has all the functionality I would need.
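For comparison, one candidate worth checking is the URI module from CPAN (not a core module), whose new_abs constructor resolves a relative reference against a base URL. The following is only a minimal sketch, assuming URI is installed; resolve_url is a hypothetical wrapper name and the example.com URLs are made up for illustration:

```perl
use strict;
use warnings;
use URI;    # CPAN module, assumed to be installed; not part of core Perl

# Hypothetical helper: resolve a link taken from an IMG tag against the
# URL of the page it was found on, and return the absolute URL as a string.
sub resolve_url {
    my ( $base, $link ) = @_;
    return URI->new_abs( $link, $base )->as_string;
}

my $page = 'http://example.com/a/b/page.html';    # made-up base URL

# A host-absolute path, a "../" relative path, and a sibling file
print resolve_url( $page, '/img/x.png' ),   "\n";  # http://example.com/img/x.png
print resolve_url( $page, '../img/x.png' ), "\n";  # http://example.com/a/img/x.png
print resolve_url( $page, 'logo.png' ),     "\n";  # http://example.com/a/b/logo.png
```

Unlike the hand-written function, this keeps the scheme in every case and leaves links that are already absolute URLs unchanged. File::Spec, by contrast, is aimed at filesystem paths rather than URLs.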
© Stack Overflow or respective owner