Getting text between quotes using regular expression

Posted by Camsoft on Stack Overflow
Published on 2010-04-27T17:07:30Z


I'm having some issues with a regular expression I'm creating.

I need a regex that matches each of the following examples and captures the first quoted string as a sub-match:

Input strings

("Lorem ipsum dolor sit amet, consectetur adipiscing elit.")

('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ')

('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ', 'arg1', "arg2")

Must sub match

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Regex so far:

\((["'])([^"']+)\1,?.*\)

The regex captures the text between the first pair of quotes and returns the sub-match shown above.
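
To make that behaviour concrete, here is a minimal sketch that runs the regex from the question, unchanged, against the three input strings. Python is an assumption (the post does not say which language the scanning script is written in), and `first_quoted` is a hypothetical helper name.

```python
import re

# The regex from the question, unchanged.
# Group 1 is the opening quote, group 2 is the text between the quotes.
PATTERN = re.compile(r"""\((["'])([^"']+)\1,?.*\)""")

inputs = [
    '("Lorem ipsum dolor sit amet, consectetur adipiscing elit.")',
    "('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ')",
    """('Lorem ipsum dolor sit amet, consectetur adipiscing elit. ', 'arg1', "arg2")""",
]


def first_quoted(text):
    """Return the text between the first pair of quotes, or None if no match."""
    match = PATTERN.search(text)
    return match.group(2) if match else None


for text in inputs:
    # repr() makes any trailing whitespace inside the quotes visible.
    print(repr(first_quoted(text)))
```

All three inputs match. For the second and third, the captured text keeps the trailing space that sits inside the quotes, so it may need trimming afterwards.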

This almost works, but if the quoted string itself contains quotes, the sub-match stops at the first one. See below:

Failing input strings

("Lorem ipsum dolor \"sit\" amet, consectetur adipiscing elit.")

Only sub matches: Lorem ipsum dolor

("Lorem ipsum dolor 'sit' amet, consectetur adipiscing elit.")

The entire match fails.
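
One possible way to handle both failing cases, offered as a sketch rather than a definitive answer: let the string body consume either a backslash followed by any character, or any character that is neither a backslash nor the opening quote. The example uses Python, which is an assumption since the post names no scripting language, and `first_argument` is a hypothetical helper name. Unlike the regex in the question, this pattern does not require the closing parenthesis; it stops right after the first quoted string.

```python
import re

# Group 1: the opening quote (" or ').
# Group 2: the string body. Each step consumes either
#   \\.           a backslash plus the character it escapes, or
#   (?!\1)[^\\]   one character that is not a backslash and not the opening quote.
# \1 then requires the same quote character to close the string.
PATTERN = re.compile(r"""\(\s*(["'])((?:\\.|(?!\1)[^\\])*)\1""", re.DOTALL)


def first_argument(call_text):
    """Return the raw body of the first quoted string after '(', or None."""
    match = PATTERN.search(call_text)
    return match.group(2) if match else None


escaped = r'("Lorem ipsum dolor \"sit\" amet, consectetur adipiscing elit.")'
mixed = """("Lorem ipsum dolor 'sit' amet, consectetur adipiscing elit.")"""

print(first_argument(escaped))
print(first_argument(mixed))
```

The captured text is returned raw, so escape sequences such as `\"` are still present in it and would need to be unescaped in a separate step if the plain text is wanted.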

Notes

The input strings are actually PHP function calls. I'm writing a script that scans .php source files for calls to a specific function and grabs the text of the first argument.
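
A rough sketch of such a scanning script might look like the following. Everything here is an assumption for illustration: the script is written in Python, the function being searched for is called `translate`, and the helper names `build_pattern`, `first_arguments`, and `scan_php_files` are hypothetical. The pattern accepts backslash-escaped characters and the other quote type inside the string.

```python
import re
from pathlib import Path


def build_pattern(func_name):
    """Build a regex capturing the first quoted argument of calls to func_name."""
    return re.compile(
        # \b stops "translate" from matching inside "my_translate".
        r"\b" + re.escape(func_name)
        # Group 1: opening quote. Group 2: body, allowing backslash escapes.
        + r"""\s*\(\s*(["'])((?:\\.|(?!\1)[^\\])*)\1""",
        re.DOTALL,
    )


def first_arguments(source, func_name):
    """Return the raw first string argument of every call found in source."""
    return [m.group(2) for m in build_pattern(func_name).finditer(source)]


def scan_php_files(root, func_name):
    """Map each .php file under root to the first arguments found in it."""
    results = {}
    for path in sorted(Path(root).rglob("*.php")):
        source = path.read_text(encoding="utf-8", errors="replace")
        found = first_arguments(source, func_name)
        if found:
            results[str(path)] = found
    return results


if __name__ == "__main__":
    # "translate" is a placeholder for whatever function is being scanned for.
    for filename, args in scan_php_files(".", "translate").items():
        for arg in args:
            print(f"{filename}: {arg}")
```

Because this works on raw text, it will also pick up calls that appear inside comments or inside other string literals, and it skips calls whose first argument is not a string literal.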

© Stack Overflow or respective owner
