Regex to match specific HTML tags

Posted by Rco8786 on Stack Overflow
Published on 2011-01-16T18:37:14Z Indexed on 2011/01/16 18:53 UTC

I need to match HTML tags (the whole tag) based on the tag name.

For script tags I have this:

<script.+src=.+(\.js|\.axd).+(</script>|>)

It correctly matches both tags, as two separate matches, in the following HTML:

<script src="Scripts/JScript1.js" type="text/javascript" />
<script type="text/javascript" src="Scripts/JScript2.js" />

However, when I try the same approach for link tags with the following:

<link.+href=.+(\.css).+(</link>|>)

It matches all of this at once (i.e. it returns one match containing both items):

<link href="Stylesheets/StyleSheet1.css" rel="Stylesheet" type="text/css" />
<link href="Stylesheets/StyleSheet2.css" rel="Stylesheet" type="text/css" />

What am I missing here? The two regexes are essentially identical except for the text they match.
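For anyone trying to reproduce this, here is a minimal sketch of how a greedy `.+` can produce exactly this symptom. It uses Python's `re` module as a stand-in for the .NET engine, and it assumes (the post does not confirm this) that the two link tags reach the regex either on a single line or with a "dot matches newline" option enabled (`re.DOTALL` here, `RegexOptions.Singleline` in .NET). The `BOUNDED` pattern and the `matches` helper are hypothetical names introduced only for illustration.

```python
import re

tag1 = '<link href="Stylesheets/StyleSheet1.css" rel="Stylesheet" type="text/css" />'
tag2 = '<link href="Stylesheets/StyleSheet2.css" rel="Stylesheet" type="text/css" />'

# Pattern from the question: every ".+" is greedy and may run past a ">".
GREEDY = r'<link.+href=.+(\.css).+(</link>|>)'
# Hypothetical alternative: "[^>]" can never cross the end of a tag.
BOUNDED = r'<link[^>]+href=[^>]+\.css[^>]*>'

def matches(pattern, text, flags=0):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text, flags)]

one_line = tag1 + tag2            # assumption: no line break between the tags
two_lines = tag1 + "\n" + tag2    # tags on separate lines

# Greedy ".+" swallows both tags when nothing stops it.
greedy_one_line = matches(GREEDY, one_line)
# By default "." does not match "\n", so each line matches on its own.
greedy_two_lines = matches(GREEDY, two_lines)
# With "dot matches newline" the greedy pattern spans both lines again.
greedy_dotall = matches(GREEDY, two_lines, re.DOTALL)

# The bounded pattern finds each tag separately in every case.
bounded_one_line = matches(BOUNDED, one_line)
bounded_dotall = matches(BOUNDED, two_lines, re.DOTALL)

print(len(greedy_one_line), len(greedy_two_lines), len(greedy_dotall))
print(len(bounded_one_line), len(bounded_dotall))
```

If that assumption holds, the script regex would behave the same way on input laid out the same way, so the difference may lie in how the two inputs are formatted rather than in the patterns themselves. Making the quantifiers lazy (`.+?`) is another common way to stop a match at the first closing `>`.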

Also, I know that regex is not a great tool for HTML parsing. I will probably end up using the HtmlAgilityPack, but this is driving me nuts and I want an answer, if only for my own mental health!
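To illustrate the parser-based route mentioned above, here is a rough sketch using Python's standard `html.parser` module. This is not HtmlAgilityPack and does not mirror its API. It only shows the general idea of letting a parser find tag boundaries and then filtering on the tag name and attributes. The `TagCollector` class is a hypothetical helper written for this example.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Collect (tag, attributes) pairs for a chosen set of tag names."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <link ... />, because
        # the default handle_startendtag delegates to handle_starttag.
        if tag in self.wanted:
            self.found.append((tag, dict(attrs)))

html = (
    '<link href="Stylesheets/StyleSheet1.css" rel="Stylesheet" type="text/css" />\n'
    '<link href="Stylesheets/StyleSheet2.css" rel="Stylesheet" type="text/css" />'
)

collector = TagCollector(["link"])
collector.feed(html)
collector.close()

# Keep only the link tags whose href ends in ".css".
hrefs = [
    attrs["href"]
    for tag, attrs in collector.found
    if (attrs.get("href") or "").endswith(".css")
]
print(hrefs)
```

Because the parser decides where each tag starts and ends, line breaks and attribute order no longer affect how many results come back.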
