Iteration through the HtmlDocument.All collection stops at the referenced stylesheet?
Posted by Jonas on Stack Overflow
Published on 2010-03-19T12:28:58Z
Indexed on 2010/03/19 12:31 UTC
Since "bug in .NET" is often not the real cause of a problem, I wonder if I'm missing something here.
What I'm doing feels pretty simple. I'm iterating through the elements in an HtmlDocument called doc like this:
System.Diagnostics.Debug.WriteLine("*** " + doc.Url + " ***");
foreach (HtmlElement field in doc.All)
    System.Diagnostics.Debug.WriteLine(string.Format("Tag = {0}, ID = {1} ", field.TagName, field.Id));
I then discovered the debug window output was this:
Tag = !, ID =
Tag = HTML, ID =
Tag = HEAD, ID =
Tag = TITLE, ID =
Tag = LINK, ID =
... when the actual HTML document looks like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>Protocol</title>
<link rel="Stylesheet" type="text/css" media="all" href="ProtocolStyle.css">
</head>
<body onselectstart="return false">
<table>
<!-- Misc. table elements and cell values -->
</table>
</body>
</html>
Commenting out the LINK tag solves the issue for me, and the document is then parsed completely. The ProtocolStyle.css file exists on disk and is loaded properly, in case that matters. Is this a bug in .NET 3.5 SP1, or what? For such a web-oriented framework, I find it hard to believe there would be such a major bug in it.
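One thing that may be worth ruling out before blaming the framework: the post does not say when the loop runs, so it is possible (this is an assumption, not a confirmed cause) that doc.All is being read while the page is still loading, with the external stylesheet fetch as the point where parsing pauses. The sketch below repeats the same iteration, but only once the WebBrowser control reports that loading has finished. The ProtocolForm class and the file path are hypothetical placeholders; only the loop body comes from the code above.

```csharp
using System;
using System.Diagnostics;
using System.Windows.Forms;

// Hypothetical host form; the original post does not show how doc is obtained.
public class ProtocolForm : Form
{
    private readonly WebBrowser browser = new WebBrowser();

    public ProtocolForm()
    {
        browser.Dock = DockStyle.Fill;
        Controls.Add(browser);

        // Defer the iteration until the control signals that loading is done.
        browser.DocumentCompleted += OnDocumentCompleted;

        // Placeholder path (assumption); point this at the real HTML file.
        browser.Navigate(@"C:\path\to\Protocol.html");
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted can fire more than once (for example per frame),
        // so skip any call that arrives before the control is fully loaded.
        if (browser.ReadyState != WebBrowserReadyState.Complete)
            return;

        HtmlDocument doc = browser.Document;
        Debug.WriteLine("*** " + doc.Url + " ***");
        foreach (HtmlElement field in doc.All)
            Debug.WriteLine(string.Format("Tag = {0}, ID = {1}", field.TagName, field.Id));
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new ProtocolForm());
    }
}
```

If the output still stops at the LINK element with this in place, timing is not the explanation and the stylesheet reference itself deserves a closer look.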
© Stack Overflow or respective owner