Get href tags in HTML data in C#

Posted by Nani on Stack Overflow
Published on 2010-04-02T09:02:55Z Indexed on 2010/04/02 9:13 UTC

I am using the WebClient class to download HTML data from a web page. Now I want to extract the complete href links and their titles from that HTML. Initially I used loops. Feeling that was inefficient, I switched to regular expressions, but I did not get an efficient solution that way either.

Here is the initial code:

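    // Scan every position in the page for the literal text "href=".
    for (int i = 0; i < htmldata.Length - 5; i++)
    {
        if (htmldata.Substring(i, 5) == "href=")
        {
            // Assume the value is double-quoted: find the closing quote after href=".
            n1 = htmldata.Substring(i + 6, htmldata.Length - (i + 6)).IndexOf("\"");
            Sublink = htmldata.Substring(i + 6, n1);
            var absoluteUri = new Uri(baseUri, temp);

            // Take the text up to the next '<' as the title
            // (assumes '>' immediately follows the closing quote).
            n2 = htmldata.Substring(i + n1 + 1, htmldata.Length - (i + n1 + 1)).IndexOf("<");
            subtitle = htmldata.Substring(i + 6 + n1 + 2, n2 - 7);
        }
    }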

This code returns some links like these:

/l.href.replace(new RegExp(

/advanced_search?hl=en&q=&hl=en&

and titles like these:

onclick=gbar.qs(this) class=gb2>Photos

")+"q="+encodeURIComponent(b)})}i.qs=n;function o(a,b,d,c,f,e){var g=document.getElementById(a);if(g){var

These are clearly invalid. Please suggest correct code for getting valid relative href links and their titles.

Thank you.
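
One possible direction, offered as a minimal sketch rather than a definitive answer: limit matching to `<a ...>...</a>` elements instead of every occurrence of `href=`, strip `<script>` blocks first so JavaScript fragments are not picked up, and accept quoted as well as unquoted attribute values (the sample output above suggests the page uses unquoted attributes such as `class=gb2`). The names `LinkExtractor` and `ExtractLinks` are hypothetical and chosen only for illustration. Regular expressions are not a full HTML parser, so malformed markup or nested anchors may still be misread, and a dedicated HTML parser is generally more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original post.
public static class LinkExtractor
{
    // An <a ...> start tag with an href value that is double-quoted,
    // single-quoted, or unquoted, followed by the inner content up to </a>.
    private static readonly Regex AnchorPattern = new Regex(
        @"<a\b[^>]*?\bhref\s*=\s*(?:""(?<url>[^""]*)""|'(?<url>[^']*)'|(?<url>[^\s>]+))[^>]*>(?<text>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Whole <script> blocks, removed before matching anchors.
    private static readonly Regex ScriptPattern = new Regex(
        @"<script\b.*?</script>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Any remaining tag inside the link text, e.g. <b> or <span>.
    private static readonly Regex TagPattern = new Regex(@"<[^>]+>");

    // Returns (absolute URI, link text) pairs in document order.
    public static List<KeyValuePair<Uri, string>> ExtractLinks(string html, Uri baseUri)
    {
        var links = new List<KeyValuePair<Uri, string>>();
        string withoutScripts = ScriptPattern.Replace(html, "");

        foreach (Match m in AnchorPattern.Matches(withoutScripts))
        {
            // Decode entities such as &amp; in both the URL and the text.
            string href = WebUtility.HtmlDecode(m.Groups["url"].Value);
            string title = WebUtility.HtmlDecode(
                TagPattern.Replace(m.Groups["text"].Value, "")).Trim();

            // Resolve relative links against the page address; skip values
            // that cannot be turned into a URI.
            Uri absolute;
            if (Uri.TryCreate(baseUri, href, out absolute))
            {
                links.Add(new KeyValuePair<Uri, string>(absolute, title));
            }
        }
        return links;
    }
}
```

With the variables from the code above, a call would look like `LinkExtractor.ExtractLinks(htmldata, baseUri)`, where each returned pair holds the resolved link in `Key` and its visible text in `Value`.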

© Stack Overflow or respective owner
