How can I fix this regex to allow a specific string?

Posted by Sailing Judo on Stack Overflow
Published on 2010-03-16T19:49:32Z

This regex comes from Atwood and is used to reject anchor tags that carry anything other than an href and an optional title:

 <a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"(\stitle="[^"]+")?\s?>

I need to allow an additional attribute that matches exactly: target="_blank". So the following tag should be allowed:

 <a href="http://www.google.com" target="_blank">

I tried changing the pattern to these:

 <a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"(\stitle="[^"]+")(\starget="_blank")?\s?>
 <a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"(\stitle="[^"]+")(\starget=\"_blank\")?\s?>

Clearly I don't know regex very well. How should the pattern be adjusted to allow the blank target and no other targets?
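
For reference, here is a minimal sketch of one possible adjustment, written in Python purely for illustration (the original pattern's host language is not stated here). Both attempts above drop the `?` after the title group, which makes title mandatory, so a tag without a title no longer matches; the `\"` in the second attempt is equivalent to a plain `"`. The sketch keeps title optional and appends one more optional group for target="_blank". The names `ANCHOR_WHITELIST` and `is_allowed_anchor` are hypothetical, and matching the whole tag with `fullmatch` is an assumption about how the pattern is applied.

```python
import re

# Sketch: the original pattern plus one optional group for target="_blank".
# The title group keeps its trailing "?" so that title stays optional.
ANCHOR_WHITELIST = re.compile(
    r'<a\shref="(\#\d+|(https?|ftp)://[-A-Za-z0-9+&@#/%?=~_|!:,.;]+)"'
    r'(\stitle="[^"]+")?'      # optional title, as in the original
    r'(\starget="_blank")?'    # optional target, literal _blank only
    r'\s?>'
)


def is_allowed_anchor(tag: str) -> bool:
    """Return True if the entire tag matches the whitelist (hypothetical helper)."""
    return ANCHOR_WHITELIST.fullmatch(tag) is not None


if __name__ == "__main__":
    samples = [
        '<a href="http://www.google.com" target="_blank">',
        '<a href="http://www.google.com" title="Google" target="_blank">',
        '<a href="http://www.google.com" target="_self">',
    ]
    for tag in samples:
        print(is_allowed_anchor(tag), tag)
```

Note that this sketch only accepts the attributes in the order href, title, target; a tag that puts target before title would still be rejected.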

© Stack Overflow or respective owner
