Python regular expressions: how to extract the longest of overlapping groups
Posted
by xulochavez
on Stack Overflow
Published on 2010-05-14T15:09:00Z
Hi
How can I extract the longest of several groups that start the same way?
For example, from a given string, I want to extract the longest match to either CS or CSI.
I tried "(CS|CSI).*", and it returns CS rather than CSI even when CSI is available.
If I use "(CSI|CS).*" instead, I do get CSI when it matches, so I guess the solution is to always place the shorter of the overlapping groups after the longer one.
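To make the behaviour concrete, here is a minimal sketch of the two patterns side by side. The sample string "CSI 101" is only an assumed input for illustration. Python's re module tries the alternatives in an alternation from left to right and keeps the first one that lets the overall match succeed, which is why the order matters here.

```python
import re

text = "CSI 101"  # hypothetical sample input, not from the original post

# Alternatives are tried left to right: "CS" succeeds first,
# and ".*" then consumes the rest, so "CSI" is never attempted.
short_first = re.match(r"(CS|CSI).*", text)

# With the longer alternative listed first, "CSI" is tried before "CS".
long_first = re.match(r"(CSI|CS).*", text)

print(short_first.group(1))  # CS
print(long_first.group(1))   # CSI
```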
Is there a clearer way to express this with regular expressions? Somehow it feels confusing that the result depends on the order in which you list the alternatives.
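One possible way to avoid ordering the alternatives by hand, sketched here under the assumption that they are plain literal strings, is to build the group programmatically with the longest alternative first. The helper name longest_first is hypothetical and not part of the re module.

```python
import re

def longest_first(alternatives):
    """Build a capturing group with the longest alternatives listed first.

    Hypothetical helper: assumes the alternatives are literal strings,
    so each one is escaped before being joined with "|".
    """
    ordered = sorted(alternatives, key=len, reverse=True)
    return "(" + "|".join(re.escape(a) for a in ordered) + ")"

pattern = re.compile(longest_first(["CS", "CSI"]) + ".*")

print(pattern.pattern)                    # (CSI|CS).*
print(pattern.match("CSI 101").group(1))  # CSI
print(pattern.match("CS 101").group(1))   # CS
```

For this particular pair, "(CSI?).*" should give the same result, because the greedy "?" takes the optional I whenever it is present.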