regex in python, can this be improved upon?
Posted by tipu on Stack Overflow
Published on 2010-06-02T19:31:05Z
Indexed on 2010/06/02 19:34 UTC
I have this piece of code that finds words that begin with @ or #:
p = re.findall(r'@\w+|#\w+', str)
What irks me about this is repeating \w+. I am sure there is a way to do something like
p = re.findall(r'(@|#)\w+', str)
I expected that to produce the same result, but it doesn't; it returns only the # and @ characters. How can the regex be changed so that I am not repeating the \w+? This code comes close:
p = re.findall(r'((@|#)\w+)', str)
But it returns [('@many', '@'), ('@this', '@'), ('#tweet', '#')] (notice the extra '@', '@', and '#').
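For reference, the behaviour described here follows from how re.findall treats capturing groups: with one group it returns only that group's text, and with several it returns tuples. A minimal sketch of the difference, assuming a made-up sample string (the original input is not shown in the question, so `text` below is hypothetical):

```python
import re

# Hypothetical sample input; the question does not show the real string.
text = "so @many people like @this #tweet"

# One capturing group: findall returns only what the group matched.
only_prefixes = re.findall(r'(@|#)\w+', text)

# Nested capturing groups: findall returns one tuple per match.
tuples = re.findall(r'((@|#)\w+)', text)

# A non-capturing group (?:...) groups the alternation without
# changing what findall returns, so \w+ is written only once.
non_capturing = re.findall(r'(?:@|#)\w+', text)

# A character class is a shorter way to say "@ or #".
char_class = re.findall(r'[@#]\w+', text)

print(only_prefixes)   # ['@', '@', '#']
print(tuples)          # [('@many', '@'), ('@this', '@'), ('#tweet', '#')]
print(non_capturing)   # ['@many', '@this', '#tweet']
print(char_class)      # ['@many', '@this', '#tweet']
```

Either of the last two patterns should give the same list as the original r'@\w+|#\w+' without repeating \w+. The character class is probably the simpler choice when each alternative is a single character.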
© Stack Overflow or respective owner