How can I find a URL called [link] inside a block of HTML containing other URLs?
- by DrTwox
I'm writing a script to rewrite Reddit's RSS feeds. The script needs to find a URL named [link] inside a block of HTML that contains other URLs. The HTML is contained in an XML element called <description>.
Here are two examples of the <description> element that I need to parse, along with the [link] I would need to get from each.
First example:
<description>submitted by <a href="http://www.reddit.com/user/wildlyinaccurate"> wildlyinaccurate </a> <br/> <a href="http://wildlyinaccurate.com/a-hackers-guide-to-git">[link]</a> <a href="http://www.reddit.com/r/programming/comments/26jvl7/a_hackers_guide_to_git/">[66 comments]</a></description>
The [link] is: http://wildlyinaccurate.com/a-hackers-guide-to-git
Second example:
<description><!-- SC_OFF --><div class="md"><p>I work a support role at a company where I primarily fix issues our customers our experiencing with our software, which is a browser based application written primarily in javascript. I&#39;ve been doing this for 2 years, but I want to take it to the next level (with the long term goal being that I become proficient enough to call myself a developer). I&#39;ve been reading &quot;Javascript The Definitive Guide&quot; by O&#39;Reilly but I was wondering if any of you more experienced users out there had some tips on taking it to the next level. Should I start incorporating some PHP and Jquery into my learning? Side projects on my spare time? Any good online resources? Etc. </p> <p>Thanks! </p> </div><!-- SC_ON --> submitted by <a href="http://www.reddit.com/user/56killa"> 56killa </a> <br/> <a href="http://www.reddit.com/r/javascript/comments/26nduc/i_want_to_become_more_experienced_with_javascript/">[link]</a> <a href="http://www.reddit.com/r/javascript/comments/26nduc/i_want_to_become_more_experienced_with_javascript/">[4 comments]</a></description>
The [link] is: http://www.reddit.com/r/javascript/comments/26nduc/i_want_to_become_more_experienced_with_javascript/
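To make the goal concrete, here is a rough sketch of the behaviour I'm after, written with Python's standard html.parser module. It assumes the contents of <description> have already been pulled out of the XML as a plain HTML string; the names LinkFinder and find_link are made up for illustration and are not from any library. It may well not be the best approach.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Hypothetical helper: records the href of the first <a> whose text is "[link]"."""

    def __init__(self):
        super().__init__()
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text chunks seen inside that <a>
        self.link = None    # result; stays None if no "[link]" anchor is found

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a":
            label = "".join(self._text).strip()
            if self.link is None and self._href is not None and label == "[link]":
                self.link = self._href
            self._href = None


def find_link(description_html):
    """Return the URL of the anchor labelled "[link]", or None if there is none."""
    parser = LinkFinder()
    parser.feed(description_html)
    parser.close()
    return parser.link


# Contents of the first example's <description>, as a plain string.
example = (
    'submitted by <a href="http://www.reddit.com/user/wildlyinaccurate"> wildlyinaccurate </a> '
    '<br/> <a href="http://wildlyinaccurate.com/a-hackers-guide-to-git">[link]</a> '
    '<a href="http://www.reddit.com/r/programming/comments/26jvl7/a_hackers_guide_to_git/">[66 comments]</a>'
)
print(find_link(example))  # http://wildlyinaccurate.com/a-hackers-guide-to-git
```

Matching on the anchor text rather than on the position of the link is deliberate, since the second example has extra HTML before the "submitted by" part and the [link] URL can point back to reddit.com itself.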