How to capture strings using * or ? with groups in Python regular expressions

Posted by user1334085 on Stack Overflow
Published on 2012-04-15T04:35:23Z Indexed on 2012/04/15 5:28 UTC

When a regular expression has a capturing group followed by "*" or "?", nothing is captured. If I use "+" on the same string instead, the group captures the value.

I need to be able to capture the same value using "?".

>>> import re
>>> str1='This string has 29 characters'

>>> re.search(r'(\d+)*', str1).group(0)
''
>>> re.search(r'(\d+)*', str1).group(1)
>>> 
>>> re.search(r'(\d+)+', str1).group(0)
'29'
>>> re.search(r'(\d+)+', str1).group(1)
'29'
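
One way to read the session above: `re.search` returns the first position at which the pattern can match, and `(\d+)*` is allowed to match zero repetitions, so it succeeds with an empty match at index 0 before the scan ever reaches the digits. The following is a minimal sketch that checks this with `Match.span()`; the variable names are only for illustration.

```python
import re

str1 = 'This string has 29 characters'

# (\d+)* may match zero repetitions, so the search succeeds right away
# at index 0 with an empty match, and group 1 never participates.
star = re.search(r'(\d+)*', str1)
star_span = star.span()      # where the (empty) match was found
star_group = star.group(1)   # None, because the group did not take part

# (\d+)+ requires at least one digit, so the search scans forward to "29".
plus = re.search(r'(\d+)+', str1)
plus_span = plus.span()
plus_group = plus.group(1)

print(star_span, star_group)
print(plus_span, plus_group)
```

So the group is not broken by "*"; the overall match simply ends up somewhere the group has nothing to capture.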

A more specific question is added below for clarity:

I have str1 and str2 below, and I want to use a single regexp that matches both. In the case of str1, I also want to capture the number of QSFP ports.

>>> str1='''4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC''' 
>>> str2='''4 48 48-port 10GigE Linecard 7548S-LC''' 
>>> 

When I do not put a quantifier on the group, the capture works:

>>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP).*-LC', str1, re.I|re.M).group(1) 
'6' 
>>>

It works even when I use "+" to indicate one or more occurrences:

>>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)+.*-LC', str1, re.I|re.M).group(1) 
'6' 
>>>

But when I use "?" to match 0 or 1 occurrences, the capture fails even for str1:

>>> re.search(r'^4\s+48\s+.*(?:(\d+)\s+QSFP)?.*-LC', str1, re.I|re.M).group(1) 
>>>
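
A likely explanation for the result above is that the greedy `.*` in front of the optional group first consumes the rest of the line and then backtracks only as far as needed for `-LC` to match. At that position the optional group can be satisfied by matching zero times, so it never captures. One possible workaround, sketched here as an assumption rather than the only fix, is to move a lazy `.*?` inside the optional group so that the engine tries to find the QSFP count before falling back to skipping the group. The name `pattern` and the variables `qsfp1` and `qsfp2` are introduced for illustration.

```python
import re

str1 = '4 48 48-port and 6 QSFP 10GigE Linecard 7548S-LC'
str2 = '4 48 48-port 10GigE Linecard 7548S-LC'

# The lazy .*? sits inside the optional group, so the group is attempted
# first and is skipped only when no "<digits> QSFP" can be found.
pattern = re.compile(r'^4\s+48\s+(?:.*?(\d+)\s+QSFP)?.*-LC', re.I | re.M)

qsfp1 = pattern.search(str1).group(1)  # QSFP count captured from str1
qsfp2 = pattern.search(str2).group(1)  # str2 still matches; group is None

print(qsfp1)
print(qsfp2)
```

With this arrangement both strings match the same expression, and group 1 is populated only when the QSFP text is present.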

© Stack Overflow or respective owner
