JavaScript regex for hashtags: matching #foo and #foo-fåäö but not http://this.is/no#hashtag
Posted
by Simon B.
on Stack Overflow
Published on 2010-04-06T22:51:39Z
Indexed on
2010/04/06
22:53 UTC
JavaScript | regex
Currently we're using the JavaScript pattern new RegExp('#[^,#=!\\s][^,#=!\\s]*') (see [1]). Note the doubled backslash: inside a string literal, a single '\s' collapses to a plain 's'.
It mostly works, except that it also matches URLs with anchors, such as http://this.is/no#hashtag. We'd also rather avoid matching foo#bar.
Some attempts have been made with look-ahead, but either it doesn't work or I just don't get it.
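To make the problem concrete, here is a minimal snippet that reproduces the over-matching. It writes the pattern as a regex literal and adds the g flag so that every match is collected; both choices, and the variable names, are assumptions for illustration rather than what the plugin in [1] does.

```javascript
// Current pattern as a regex literal; the 'g' flag is an assumption
// added here so that match() returns every hit on a line.
var currentPattern = /#[^,#=!\s][^,#=!\s]*/g;

var samples = [
  'http://this.is/no#hashtag',
  'xxy#bar',
  '#public #writable'
];

// One array of matches per sample line.
var results = samples.map(function (line) {
  return line.match(currentPattern);
});

// results[0] holds '#hashtag' (the unwanted URL anchor),
// results[1] holds '#bar' (the unwanted mid-word match).
console.log(results);
```

Nothing in the pattern looks at the character before the #, which is why the URL anchor and xxy#bar are picked up.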
Given the source text below:
#public #writable #kommentarer-till-beta -- all these should be matched
Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool. -- mixed within text
http://this.is/no#hashtag -- problem
xxy#bar -- We'd prefer not matching this one, and...
#foo=bar =foo#bar -- we probably shouldn't match any of those either.
#foo,bar #foo;bar #foo-bar #foo:bar -- We're flexible on whether these get matched in part or in full
We'd like to get the output below (showing $ instead of <a class=tag href=.....>...</a> for readability):
$ $ $ -- all these should be matched
Verkligen $ jobbat! T ex $ till $ och $, kool. -- mixed within text
http://this.is/no$ -- problem
xxy$ -- We'd prefer not matching this one, and...
$=bar =foo$ -- we probably shouldn't match any of those either.
$,bar $ $ $ -- We're flexible on whether these get matched in part or in full
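One possible direction, offered as a sketch rather than a tested fix: since look-behind is not available in every JavaScript engine, the character before the # can be captured and written back in the replacement, with a look-ahead guarding the end of the tag. The name tagify, the variable names, and the choice to reject #foo=bar entirely are assumptions based on the comments in the sample text, not something taken from [1].

```javascript
// Hypothetical pattern: group 1 keeps the start-of-text or whitespace
// that must come before '#'. The trailing look-ahead rejects a tag that
// is followed by '=' and prevents backtracking to a shorter tag.
var hashtagPattern = /(^|\s)#[^\s,#=!]+(?![^\s,#!])/g;

// Hypothetical helper: replaces each accepted hashtag with '$',
// mirroring the placeholder used in the listing above.
function tagify(text) {
  return text.replace(hashtagPattern, function (match, prefix) {
    return prefix + '$';
  });
}

var linked = tagify('Verkligen #bra jobbat! http://this.is/no#hashtag #foo=bar');
console.log(linked);
```

With this sketch the URL anchor, xxy#bar, #foo=bar and =foo#bar are left untouched, while #foo,bar is matched in part. The tag class is still "anything except whitespace and , # = !", so trailing punctuation such as a period would be swallowed into the tag; tightening that class is left open.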
[1] http://github.com/ether/pad/blob/master/etherpad/src/plugins/twitterStyleTags/hooks.js
© Stack Overflow or respective owner