Square Brackets in Python Regular Expressions (re.sub)
Posted by user1479984 on Stack Overflow
Published on 2012-06-25T15:08:05Z
I'm migrating wiki pages from the FlexWiki engine to the FOSwiki engine using Python regular expressions to handle the differences between the two engines' markup languages.
The FlexWiki markup and the FOSwiki markup, for reference.
Most of the conversion works very well, except when I try to convert the renamed links. Both wikis support renamed links in their markup.
For example, FlexWiki uses:
"Link To Wikipedia":[http://www.wikipedia.org/]
FOSwiki uses:
[[http://www.wikipedia.org/][Link To Wikipedia]]
both of which render as a link with the text "Link To Wikipedia".
I'm using the regular expression
renameLink = re.compile ("\"(?P<linkName>[^\"]+)\":\[(?P<linkTarget>[^\[\]]+)\]")
to parse the link elements out of the FlexWiki markup. Given input such as
"Link Name":[LinkTarget]
it reliably produces the groups
<linkName> = Link Name
<linkTarget> = LinkTarget
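One way to confirm those groups is to print them directly. This is a minimal sketch, assuming Python 3 and a raw-string version of the same pattern. The names rename_link, sample, and groups are only illustrative and do not come from the original script.

```python
import re

# Same pattern as above, written as a raw string so the backslashes
# reach the regex engine unchanged (assumption: Python 3).
rename_link = re.compile(r'"(?P<linkName>[^"]+)":\[(?P<linkTarget>[^\[\]]+)\]')

# Hypothetical sample line in FlexWiki renamed-link syntax.
sample = '"Link Name":[LinkTarget]'

match = rename_link.search(sample)
# groupdict() maps each named group to the text it captured.
groups = match.groupdict() if match else {}
print(groups)
```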
My issue occurs when I try to use re.sub to insert the parsed content into the FOSwiki markup.
My experience with regular expressions isn't anything to write home about, but I'm under the impression that, given the groups
<linkName> = Link Name
<linkTarget> = LinkTarget
a line like
line = renameLink.sub ( "[[\g<linkTarget>][\g<linkName>]]" , line )
should produce
[[LinkTarget][Link Name]]
However, in the output to the text files I'm getting
[[LinkTarget [[Link Name]]
which breaks the renamed links.
After a little bit of fiddling I managed a workaround, where
line = renameLink.sub ( "[[\g<linkTarget>][ [\g<linkName>]]" , line )
produces
[[LinkTarget][ [[Link Name]]
which, when displayed in FOSwiki, looks like
<[[Link Name> <--- Which WORKS, but isn't very pretty.
I've also tried
line = renameLink.sub ( "[[\g<linkTarget>]" + "[\g<linkName>]]" , line )
which is producing
[[linkTarget [[linkName]]
There are probably thousands of instances of these renamed links in the pages I'm trying to convert, so fixing them by hand isn't practical. For the record, I've run the script under Python 2.5.4 and Python 2.7.3 and gotten the same results.
Am I missing something really obvious with the syntax? Or is there an easy workaround?
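To narrow this down, the pattern and the replacement can be run on their own, outside the rest of the conversion script. The sketch below assumes Python 3 and raw strings. The helper to_foswiki and the sample sentence are hypothetical. If this prints the expected FOSwiki markup, the stray brackets may be coming from another substitution applied to the same line. That is something to check, not a confirmed diagnosis.

```python
import re

# Raw-string version of the pattern from the question.
rename_link = re.compile(r'"(?P<linkName>[^"]+)":\[(?P<linkTarget>[^\[\]]+)\]')


def to_foswiki(line):
    """Rewrite FlexWiki renamed links in `line` as FOSwiki renamed links."""
    # \g<name> in the replacement template inserts the named group;
    # the square brackets in the template are literal characters.
    return rename_link.sub(r"[[\g<linkTarget>][\g<linkName>]]", line)


# Hypothetical input containing one renamed link.
converted = to_foswiki('See "Link To Wikipedia":[http://www.wikipedia.org/] for details.')
print(converted)
```

Running each conversion rule on a single known line like this makes it easier to see which step changes the brackets.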
© Stack Overflow or respective owner