regex to match postgresql bytea
- by filiprem
In PostgreSQL, there is a BLOB datatype called bytea. It's just an array of bytes.
bytea literals are output in the following way:
'\\037\\213\\010\\010\\005`Us\\000\\0001.fp3\'\\223\\222%'
See the PostgreSQL docs for the full definition of the format.
I'm trying to construct a Perl regular expression which will match any such string.
It should also match standard ANSI SQL string literals, like 'Joe', 'Joe''s Mom', 'Fish Called ''Wendy'''.
It should also match the backslash-escaped variant: 'Joe\'s Mom'.
My first approach (shown below) works only for some bytea representations.
s{
    '               # Opening apostrophe
    (?:             # Start group
        [^\\\']     # Anything but a backslash or an apostrophe
      |             # or
        \\ .        # Backslash and anything
      |             # or
        \'\'        # Double apostrophe
    )*              # End of group
    '               # Closing apostrophe
}{LITERAL_REPLACED}xgo;
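To try this pattern against the short examples from above, a small harness along these lines can be used. This is only a sketch: the sub name replace_literals and the SELECT wrapper around each sample are made up for illustration, and the pattern is the same one as above with the redundant escapes and the /o flag dropped.

```perl
use strict;
use warnings;

# Hypothetical helper: run the substitution on a copy of the input string.
sub replace_literals {
    my ($sql) = @_;
    $sql =~ s{
        '                  # opening apostrophe
        (?:
            [^\\']         # anything but a backslash or an apostrophe
          | \\ .           # backslash and anything
          | ''             # doubled apostrophe
        )*
        '                  # closing apostrophe
    }{LITERAL_REPLACED}xg;
    return $sql;
}

# A single-quoted heredoc keeps every backslash exactly as typed.
chomp(my $bytea = <<'END');
'\\037\\213\\010\\010\\005`Us\\000\\0001.fp3\'\\223\\222%'
END

my @samples = (
    q{'Joe'},
    q{'Joe''s Mom'},
    q{'Fish Called ''Wendy'''},
    q{'Joe\'s Mom'},
    $bytea,
);

print replace_literals("SELECT $_"), "\n" for @samples;
```

Each of these short samples should come out as SELECT LITERAL_REPLACED; the trouble only starts with much longer values.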
For other bytea values (longer ones, with many escaped apostrophes), Perl gives this warning:
Complex regular subexpression recursion limit (32766) exceeded at ./sqa.pl line 33, < line 1.
So I am looking for a better (but still regex-based) solution. It probably requires some regex alchemy (avoiding backreferences and all).
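One direction that might help is "unrolling the loop": runs of ordinary characters are consumed by a simple [^\\']* instead of one character per pass through the alternation group. The sketch below is an assumption, not a verified fix. The group still iterates once per escape sequence, so a value with more than 32766 escapes could presumably still trigger the warning. The variable name $literal_re and the table name t are made up for illustration.

```perl
use strict;
use warnings;

# Sketch of an unrolled pattern (an assumption, not a verified fix).
my $literal_re = qr{
    '                      # opening apostrophe
    [^\\']*                # run of ordinary characters
    (?:
        (?: \\ . | '' )    # one escape: backslash + any char, or doubled apostrophe
        [^\\']*            # next run of ordinary characters
    )*
    '                      # closing apostrophe
}x;

my $sql = q{INSERT INTO t VALUES ('Joe''s Mom', 'Joe\'s Mom')};
(my $masked = $sql) =~ s/$literal_re/LITERAL_REPLACED/g;
print "$masked\n";   # INSERT INTO t VALUES (LITERAL_REPLACED, LITERAL_REPLACED)
```

The design idea is that the group now iterates once per escape rather than once per character, which should lower the iteration count considerably for values that are mostly plain bytes.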