Python regular expressions: how to extract the longest of overlapping groups

Posted by xulochavez on Stack Overflow
Published on 2010-05-14T15:09:00Z Indexed on 2010/05/14 15:44 UTC

Hi

How can I extract the longest of several groups that start the same way?

For example, from a given string, I want to extract the longest match to either CS or CSI.

I tried "(CS|CSI).*", and it returns CS rather than CSI even when CSI is available.

If I use "(CSI|CS).*" instead, I do get CSI when it matches, so I guess the solution is to always place the shorter of the overlapping groups after the longer one.
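Here is a small sketch of the behaviour described above. The sample string "CSI 2010" is made up for illustration; the two patterns are the ones from the question.

```python
import re

# Hypothetical sample input, chosen so that both CS and CSI could match.
text = "CSI 2010"

# Python tries alternatives left to right and keeps the first one that
# lets the overall pattern succeed, even if a later one is longer.
short_first = re.match(r"(CS|CSI).*", text).group(1)
long_first = re.match(r"(CSI|CS).*", text).group(1)

print(short_first)  # CS
print(long_first)   # CSI
```

With the longer alternative listed first, a string that only contains CS still matches, because CSI fails and the engine falls back to CS.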

Is there a clearer way to express this with regular expressions? Somehow it feels confusing that the result depends on the order in which you list the groups.
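Two possible ways to make the intent explicit, offered only as a sketch: build the alternation from a list sorted longest first, or, when one keyword is just the other plus a suffix, make the suffix optional. The helper name longest_first_pattern and the sample strings are hypothetical, not part of the re module.

```python
import re

def longest_first_pattern(keywords):
    """Build a capturing alternation with the longest keywords first.

    Hypothetical helper: sorting by length means the order in which the
    caller lists the keywords no longer affects which one is matched.
    """
    ordered = sorted(keywords, key=len, reverse=True)
    # re.escape guards against keywords containing regex metacharacters.
    return "(" + "|".join(re.escape(k) for k in ordered) + ")"

pattern = re.compile(longest_first_pattern(["CS", "CSI"]) + ".*")
print(pattern.match("CSI 2010").group(1))
print(pattern.match("CS 2010").group(1))

# Alternative for this specific pair: "?" is greedy, so the optional
# trailing I is consumed whenever it is present.
print(re.match(r"(CSI?).*", "CSI 2010").group(1))
```

The sorted version scales to any number of overlapping keywords, while the optional-suffix version only works when one keyword is a prefix of the other.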

© Stack Overflow or respective owner