Extracting email addresses in an HTML block in Ruby/Rails

Posted by corroded on Stack Overflow
Published on 2010-05-06T14:56:30Z Indexed on 2010/05/06 15:48 UTC
Filed under:

regex

I am creating a parser that guards against spamming and harvesting of email addresses in a block of text that comes from TinyMCE (so it may or may not contain HTML tags).

I've tried regexes, and so far this one has worked:

/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

The problem is that I need to ignore any email address that appears inside a mailto href. For example:

<a href="mailto:test@mail.com">test@mail.com</a>

should only return the second email address.
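To make the problem concrete, here is a small check using String#scan with the pattern above. The sample address test@mail.com is inferred from the reversed example in the post, and the constant name EMAIL_PATTERN is only illustrative.

```ruby
# The pattern from the question, stored in a constant (name is illustrative).
EMAIL_PATTERN = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

html = '<a href="mailto:test@mail.com">test@mail.com</a>'

# scan returns every match, so the address inside the href is included too.
matches = html.scan(EMAIL_PATTERN)
puts matches.inspect
# prints ["test@mail.com", "test@mail.com"]
```

Both occurrences come back, which is why a plain gsub with this pattern also rewrites the href.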

For background: I'm reversing the email addresses in the block, so the above example would look like this:

<a href="mailto:test@mail.com">moc.liam@tset</a>

The problem with my current regex is that it also replaces the address in the href. Is there a way to do this with a single regex, or do I have to check for one and then the other? Can I do this with gsub alone, or do I have to use Nokogiri or Hpricot to parse the mailto links? Thanks in advance!

Here are my references, by the way:
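One possible single-gsub approach, offered as a sketch rather than a definitive answer: let the regex match either a whole mailto: value or a bare address, and only reverse the bare one. The alternation consumes the mailto: text first, so the address inside it is never seen on its own. The names ADDRESS, MAILTO_OR_ADDRESS and reverse_visible_emails are hypothetical, and the sketch assumes the mailto value ends at a quote, whitespace or ">".

```ruby
# Same address pattern as in the question.
ADDRESS = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

# First alternative swallows a whole "mailto:..." value (capture group 1);
# second alternative matches a bare address.
MAILTO_OR_ADDRESS = /(mailto:[^"'\s>]+)|#{ADDRESS}/i

# Hypothetical helper name, not part of Ruby or Rails.
def reverse_visible_emails(text)
  text.gsub(MAILTO_OR_ADDRESS) do |match|
    # Group 1 is set only when the mailto alternative matched:
    # return that text unchanged, otherwise reverse the bare address.
    Regexp.last_match(1) ? match : match.reverse
  end
end

puts reverse_visible_emails('<a href="mailto:test@mail.com">test@mail.com</a>')
# prints <a href="mailto:test@mail.com">moc.liam@tset</a>
```

A negative lookbehind such as (?<!mailto:) looks simpler, but with this character class it can still match part of an address like a.b@mail.com after the mailto: prefix, so the alternation is the safer sketch. Note that it also leaves a literal "mailto:..." in plain text untouched.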

so.com/questions/504860/extract-email-addresses-from-a-block-of-text

so.com/questions/1376149/regexp-for-extracting-a-mailto-address

I'm also testing with this:

http://rubular.com/

Edit:

Here's my current helper code:

def email_obfuscator(text)
  # Wrap every matched address in a span and reverse its characters.
  text.gsub(/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i) do |m|
    "<span class='anti-spam'>#{m.reverse}</span>"
  end
end

which results in this:

<a target="_self" href="mailto:<span class='anti-spam'>moc.liamg@tset</span>"><span class="anti-spam">moc.liamg@tset</span></a>
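A rough alternative to pulling in an HTML parser, sketched with only the standard library: split the markup into tags and the text between them, then run the substitution on the text pieces only, so attribute values such as href are never touched. The names BARE_EMAIL and obfuscate_text_emails are hypothetical, and the sketch assumes no attribute value contains a ">" and that there are no comments or script blocks to worry about. For messier markup a real parser such as Nokogiri would be more robust.

```ruby
# Same address pattern as in the question.
BARE_EMAIL = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

# Hypothetical variant of the helper above that skips anything inside a tag.
def obfuscate_text_emails(html)
  # The capture group makes split keep the tags in the result:
  # even-indexed pieces are text, odd-indexed pieces are tags.
  pieces = html.split(/(<[^>]*>)/)

  pieces.each_with_index.map do |piece, index|
    next piece if index.odd? # leave tags (and their attributes) alone

    piece.gsub(BARE_EMAIL) { |m| "<span class='anti-spam'>#{m.reverse}</span>" }
  end.join
end

puts obfuscate_text_emails('<a target="_self" href="mailto:test@gmail.com">test@gmail.com</a>')
# prints <a target="_self" href="mailto:test@gmail.com"><span class='anti-spam'>moc.liamg@tset</span></a>
```

Only the visible link text ends up wrapped and reversed; the href keeps the original address.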

Tags: ruby, ruby-on-rails, email-integration, html-parsing, regex
