Converting regex statement for sentence extraction to Ruby
Posted by DavidP6 on Stack Overflow
Published on 2010-05-01T08:00:11Z
I found this regex statement on Wikipedia (http://en.wikipedia.org/wiki/Sentence_boundary_disambiguation) for sentence boundary disambiguation, but I am not able to use it in a Ruby split statement. I'm not too good with regex, so maybe I am missing something? This is the statement:
((?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]\"))(\s|\r\n)(?=\"?[A-Z])
and this is what I tried in Ruby, but no go:
text.split("((?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]\"))(\s|\r\n)(?=\"?[A-Z])")
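One likely cause: `String#split` treats a String argument as literal text to split on, not as a pattern, so the expression has to be passed as a Regexp. Below is a minimal sketch, assuming Ruby 1.9 or later (the 1.8 regex engine does not support lookbehind). The names `SENTENCE_BOUNDARY` and `split_sentences` are made up for illustration. The groups are written as non-capturing because `split` adds the text of any capturing groups to the result array.

```ruby
# Hypothetical constant name; the pattern is the Wikipedia expression
# written as a Regexp literal, so the double quotes need no escaping.
SENTENCE_BOUNDARY = /
  (?:(?<=[a-z0-9)][.?!])|(?<=[a-z0-9][.?!]"))  # end of sentence, optionally with a closing quote
  (?:\r\n|\s)                                  # the whitespace that is split on
  (?="?[A-Z])                                  # next sentence starts with a capital letter
/x

# Hypothetical helper: splits text at the boundaries matched above.
def split_sentences(text)
  text.split(SENTENCE_BOUNDARY)
end

text = 'This is a test. Another sentence here! He said "Stop." Then he left. Is it 3.5 today?'
sentences = split_sentences(text)
puts sentences
```

The pattern is still only a heuristic: abbreviations followed by a capitalized word (for example a title before a name) will be split as if they ended a sentence.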