PHP regular expression subpattern behaviour

Posted by codecowboy on Stack Overflow
Published on 2010-06-15T09:37:01Z Indexed on 2010/06/15 9:42 UTC

I want to match both the src and title attributes of an image tag:

pattern:

<img [^>]*src=["|\']([^"|\']+["|\'])|title=["|\']([^"|\']+)

target:

<img src="http://someurl.jpg" class="quiz_caption" title="Caption goes here!">

This pattern gives me one unwanted match, title="content", as well as the match I actually want, which is the value between the quotes after the word 'title', i.e. 'content'.

So, my matches are:

<img src="http://someurl.jpg
http://someurl.jpg
title="Caption goes here!"
Caption goes here!
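
A minimal sketch of where a list like this comes from, assuming the pattern is run through preg_match_all() with PREG_SET_ORDER (the post does not say which call or flags were used). In each match set, index 0 is the text matched by the whole pattern and the higher indexes are the parenthesised subpatterns. The pattern here is a simplified variant, not the one posted: the closing quote sits outside each group, and the "|" is dropped from the character classes, where it only matches a literal pipe.

```php
<?php
// Simplified variant of the posted pattern (an assumption, not the original).
$pattern = '/<img [^>]*src=["\']([^"\']+)["\']|title=["\']([^"\']+)["\']/';
$target  = '<img src="http://someurl.jpg" class="quiz_caption" title="Caption goes here!">';

// PREG_SET_ORDER returns one array per match:
// index 0 is the full match, indexes 1 and up are the subpatterns.
preg_match_all($pattern, $target, $sets, PREG_SET_ORDER);

foreach ($sets as $n => $set) {
    foreach ($set as $index => $text) {
        echo "match $n, index $index: $text\n";
    }
}
```

The pattern matches twice, once per alternative, so the full match at index 0 is reported twice. The title="..." string is the index 0 entry of the second match, not a subpattern.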

Is there a way to avoid the third of these matches? I'm using PCRE in PHP 5.2.x.
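
One possible approach, sketched under assumptions: PCRE always reports the full match at index 0, so the simplest option is to read only the subpattern indexes. Alternatively, the attribute name and opening quote can go into fixed-length lookbehinds, so that the full match is the value itself. The pattern below is a hypothetical rewrite, not the original, and it is no longer tied to the img tag: it would also pick up src or title attributes on other elements, and names that merely end in src= or title=.

```php
<?php
// Hypothetical lookbehind variant: the attribute name and opening quote are
// asserted but not consumed, so they are not part of the reported match.
$pattern = '/(?<=src=["\'])[^"\']+|(?<=title=["\'])[^"\']+/';
$target  = '<img src="http://someurl.jpg" class="quiz_caption" title="Caption goes here!">';

preg_match_all($pattern, $target, $matches);

// There are no capturing groups, so $matches[0] holds the attribute values only.
$values = $matches[0];
print_r($values);
```

For anything beyond a known, fixed snippet of markup, an HTML parser such as PHP's DOMDocument is usually more robust than a regular expression.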

© Stack Overflow or respective owner
