RegExp: want to find all links that do not end in ".html"

Posted by grovel on Stack Overflow
Published on 2010-03-25T11:07:34Z Indexed on 2010/03/25 11:13 UTC

Hi,

I'm a relative novice with regular expressions (although I've used them successfully many times). I want to find all links in a document that do not end in ".html". The regular expression I came up with is:

href=\"([^"]*)(?<!html)\"

In Notepad++, my editor, href=\"([^"]*)\" finds all the links (both those that end in "html" and those that do not). Why doesn't the negative lookbehind work?
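
For comparison, here is a minimal sketch of the same lookbehind pattern run in Python's re module, an engine that supports fixed-width lookbehind. The sample string and variable names are made up for illustration. If the pattern behaves as intended here but not in the editor, one possible explanation (an assumption, not something confirmed above) is that the editor's regex engine does not implement lookaround at all.

```python
import re

# Same pattern as above, written as a Python raw string (no need to escape the quotes).
# (?<!html) asserts that the four characters just before the closing quote are not "html".
pattern = re.compile(r'href="([^"]*)(?<!html)"')

# Hypothetical sample markup, not taken from the original document.
sample = '<a href="a.html">A</a> <a href="b.pdf">B</a> <a href="c/">C</a>'

links = pattern.findall(sample)
print(links)  # → ['b.pdf', 'c/']
```

The link ending in "html" is skipped because no amount of backtracking inside [^"]* lets the closing quote match at a position that is not preceded by "html".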

I've also tried lookahead:

href=\"[^"]*(?!html\")

but that didn't work either.
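
A sketch of why this lookahead version cannot filter anything, again using Python's re module with a made-up sample string: [^"]* consumes everything up to the closing quote, so the lookahead is tested at a position where the next character is the quote itself, never html", and it always succeeds. The second pattern is a hypothetical variant (not from the original post) that places the lookahead directly after the opening quote instead.

```python
import re

# Hypothetical sample markup for illustration.
sample = '<a href="a.html">A</a> <a href="b.pdf">B</a>'

# The attempt above: the lookahead is checked right before the closing quote,
# where the upcoming text starts with '"', so (?!html") is always satisfied.
attempt = re.compile(r'href="[^"]*(?!html")')
attempt_matches = attempt.findall(sample)

# Hypothetical variant: check from the start of the attribute value that it
# does not run up to html" before consuming it.
variant = re.compile(r'href="(?![^"]*html")([^"]*)"')
variant_matches = variant.findall(sample)

print(attempt_matches)  # → ['href="a.html', 'href="b.pdf']
print(variant_matches)  # → ['b.pdf']
```

Both forms still depend on the engine supporting lookahead in the first place.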

Can anybody help?

Cheers, grovel

© Stack Overflow or respective owner