Extracting URLs (to array) in Ruby
Posted
by FearMediocrity
on Stack Overflow
Published on 2010-04-07T11:59:59Z
Good afternoon,
I'm learning to use regular expressions in Ruby and have hit a point where I need some assistance. I am trying to extract zero or more URLs from a string.
This is the code I'm using:
sStrings = [
  "hello world: http://www.google.com",
  "There is only one url in this string http://yahoo.com . Did you get that?",
  "The first URL in this string is http://www.bing.com and the second is http://digg.com",
  "This one is more complicated http://is.gd/12345 http://is.gd/4567?q=1",
  "This string contains no urls"
]
sStrings.each do |s|
  x = s.scan(/((http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.[\w-]*)?)/ix)
  x.each do |url|
    puts url
  end
end
This is what is returned:
http://www.google.com
http
.google
nil
nil
http://yahoo.com
http
nil
nil
nil
http://www.bing.com
http
.bing
nil
nil
http://digg.com
http
nil
nil
nil
http://is.gd/12345
http
nil
/12345
nil
http://is.gd/4567
http
nil
/4567
nil
What is the best way to extract only the full URLs, without the sub-matches captured by the parenthesized groups inside the regex?
Thanks
Jim
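For reference, one possible approach, offered as a sketch rather than a definitive answer: String#scan returns an array of capture groups for each match whenever the pattern contains capturing parentheses, and it returns plain matched strings when there are none. Rewriting each group as non-capturing, (?:...), should therefore leave only the full URLs. The names URL_PATTERN and extract_urls below are made up for illustration, and the pattern is the one from the question with minor equivalent simplifications (https? instead of (http|https), no {1}, no x flag), so it keeps the same limitations, such as stopping at the "?" in a query string.

```ruby
# The question's pattern with every group made non-capturing, (?:...).
# With no capture groups, String#scan returns an array of whole-match strings.
# URL_PATTERN and extract_urls are illustrative names, not part of the original post.
URL_PATTERN = /https?:\/\/[a-z0-9]+(?:[\-\.][a-z0-9]+)*\.[a-z]{2,5}(?:(?:[0-9]{1,5})?\/.[\w\-]*)?/i

# Returns an array with zero or more URL strings found in text.
def extract_urls(text)
  text.scan(URL_PATTERN)
end

strings = [
  "hello world: http://www.google.com",
  "The first URL in this string is http://www.bing.com and the second is http://digg.com",
  "This one is more complicated http://is.gd/12345 http://is.gd/4567?q=1",
  "This string contains no urls"
]

strings.each do |s|
  p extract_urls(s)
end
# => ["http://www.google.com"]
# => ["http://www.bing.com", "http://digg.com"]
# => ["http://is.gd/12345", "http://is.gd/4567"]
# => []
```

If the original pattern has to stay untouched, taking the first element of each result, for example s.scan(pattern).map(&:first), should give the same list, because the outermost parentheses form the first capture group.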
© Stack Overflow or respective owner