Regex to match PostgreSQL bytea
Posted by filiprem on Stack Overflow
Published on 2010-03-01T11:19:30Z
In PostgreSQL, there is a BLOB datatype called bytea. It's just an array of bytes.
bytea literals are output in the following way:
'\\037\\213\\010\\010\\005`Us\\000\\0001.fp3\'\\223\\222%'
See the PostgreSQL docs for the full definition of the format.
I'm trying to construct a Perl regular expression which will match any such string.
It should also match standard ANSI SQL string literals, like 'Joe', 'Joe''s Mom', 'Fish Called ''Wendy'''.
It should also match the backslash-escaped variant: 'Joe\'s Mom'.
My first approach (shown below) works only for some bytea representations.
s{
    '               # Opening apostrophe
    (?:             # Start group
        [^\\\']     # Anything but a backslash or an apostrophe
      |             # or
        \\ .        # Backslash and anything
      |             # or
        \'\'        # Double apostrophe
    )*              # End of group
    '               # Closing apostrophe
}{LITERAL_REPLACED}xgo;
For others (longer ones, with many escaped apostrophes), Perl gives this warning:
Complex regular subexpression recursion limit (32766) exceeded at ./sqa.pl line 33, <> line 1.
So I am looking for a better (but still regex-based) solution. It probably requires some regex alchemy (avoiding backreferences and all).
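One direction that may be worth trying is to let each alternative swallow a whole run of characters, so the quantified group iterates once per run rather than once per character. The following is only a sketch under assumptions, not a verified fix: it needs Perl 5.10 or later for the possessive quantifiers (`++`, `*+`), the names `$literal_re` and `replace_literals` are made up for illustration, and whether it stays under the recursion limit for very long inputs on a given Perl build is something I have not confirmed.

```perl
use strict;
use warnings;

# Hypothetical name: a pattern for one SQL string literal.
# Each branch consumes a run of characters and never gives it back.
my $literal_re = qr{
    '                   # opening apostrophe
    (?:
        [^\\']++        # run of characters that are neither backslash nor apostrophe
      | (?: \\ . )++    # run of backslash escapes (backslash plus any character)
      | (?: '' )++      # run of doubled apostrophes
    )*+                 # any number of runs, without backtracking into them
    '                   # closing apostrophe
}xs;

# Hypothetical helper: replace every literal in a piece of SQL text.
sub replace_literals {
    my ($sql) = @_;
    (my $out = $sql) =~ s/$literal_re/LITERAL_REPLACED/g;
    return $out;
}

# Single-quoted heredoc, so backslashes reach the regex unchanged.
my $sql = <<'SQL';
INSERT INTO t VALUES ('Joe''s Mom', 'Joe\'s Mom', '\\037\\213\'\\223%');
SQL

print replace_literals($sql);
# prints: INSERT INTO t VALUES (LITERAL_REPLACED, LITERAL_REPLACED, LITERAL_REPLACED);
```

The `/s` flag is there so that a backslash followed by a newline still counts as an escape. The `/o` flag from the first attempt is dropped because the pattern is already compiled once by `qr{}`.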
© Stack Overflow or respective owner