JavaScript regex for hashtags: match #foo and #foo-fåäö but not http://this.is/no#hashtag

Posted by Simon B. on Stack Overflow
Published on 2010-04-06T22:51:39Z Indexed on 2010/04/06 22:53 UTC

Currently we're using the JavaScript regex new RegExp('#[^,#=!\s][^,#=!\s]*') (see [1]). It mostly works, except that it also matches URLs with anchors, such as http://this.is/no#hashtag. We'd also rather avoid matching foo#bar.
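
One thing worth checking, assuming the pattern really is passed as a single-quoted string exactly as written above: inside a JavaScript string literal, \s is just the letter s, so the backslash has to be doubled (or a regex literal used) before the character class excludes whitespace. A small check, with variable names invented for illustration:

```javascript
// The pattern exactly as quoted above, built from a string literal.
var fromString = new RegExp('#[^,#=!\s][^,#=!\s]*');
// The same pattern with the backslash doubled, so that \s reaches the regex engine.
var escaped = new RegExp('#[^,#=!\\s][^,#=!\\s]*');

// The first class contains a plain "s"; the second contains \s.
console.log(fromString.source);
console.log(escaped.source);

// The undoubled version runs through spaces (and would stop at a letter "s").
var sample = '#bra jobbat';
console.log(fromString.exec(sample)[0]);
console.log(escaped.exec(sample)[0]);
```

If the real code at [1] uses a regex literal or a doubled backslash, this does not apply and the pattern behaves as described.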

We have made some attempts with look-ahead, but either it doesn't work or I just don't get it.

Given the source text below:

#public #writable #kommentarer-till-beta -- all these should be matched
Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool. -- mixed within text
http://this.is/no#hashtag -- problem
xxy#bar      -- We'd prefer not matching this one, and...
#foo=bar   =foo#bar  -- we probably shouldn't match any of those either.
#foo,bar #foo;bar #foo-bar #foo:bar   -- We're flexible on whether these get matched in part or in full

We'd like to get the output below (showing $ instead of <a class=tag href=.....>...</a> for readability):

$ $ $ -- all these should be matched
Verkligen $ jobbat! T ex $ till $ och $, kool. -- mixed within text
http://this.is/no$ -- problem
xxy$      -- We'd prefer not matching this one, and...
$=bar   =foo$  -- we probably shouldn't match any of those either.
$,bar $ $ $   -- We're flexible on whether these get matched in part or in full
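
For illustration, here is a minimal sketch of one way to meet the requirements stated in the prose: a tag must start the text or follow whitespace, and must be followed by whitespace, a comma, an exclamation mark, or the end of the text. The names HASHTAG and replaceHashtags and the render callback are hypothetical, not taken from the EtherPad plugin, and the excluded characters are simply carried over from the current pattern.

```javascript
// Hypothetical helper, not the plugin's actual code.
// (^|\s)         the tag must start the text or follow whitespace
// #([^\s,#=!]+)  the tag body, excluding the same characters as the current pattern
// (?=[\s,!]|$)   look-ahead: whitespace, "," or "!" must follow, or the text must end
var HASHTAG = /(^|\s)#([^\s,#=!]+)(?=[\s,!]|$)/g;

function replaceHashtags(text, render) {
  return text.replace(HASHTAG, function (match, lead, tag) {
    // Put back the whitespace consumed by the first group.
    return lead + render(tag);
  });
}

var source = [
  '#public #writable #kommentarer-till-beta',
  'Verkligen #bra jobbat! T ex #kommentarer till #artiklar och #blogginlägg, kool.',
  'http://this.is/no#hashtag',
  'xxy#bar',
  '#foo=bar   =foo#bar',
  '#foo,bar #foo;bar #foo-bar #foo:bar'
].join('\n');

// Use "$" as the placeholder, as in the listing above.
var result = replaceHashtags(source, function () { return '$'; });
console.log(result);
```

With this sketch the URL, xxy#bar, and both #foo=bar forms are left untouched, because the look-ahead rejects a tag that runs into = or #. Capturing the leading whitespace stands in for a look-behind; in engines that support ES2018 look-behind, (?<!\S) could replace the first group. The render callback is where the real <a class=tag ...> markup would be built.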

[1] http://github.com/ether/pad/blob/master/etherpad/src/plugins/twitterStyleTags/hooks.js

© Stack Overflow or respective owner
