win32 ruby1.9 regexp and cyrillic string
Posted
by scriper
on Stack Overflow
Published on 2010-04-27T14:06:18Z
ruby
#coding: utf-8
str2 = "asdf????????"
p str2.encoding #<Encoding:UTF-8>
p str2.scan /\p{Cyrillic}/ # finds all Cyrillic characters
str2.gsub!(/\w/u,'') #removes only latin characters
puts str2
The question is: why does \w ignore Cyrillic characters?
I have installed the latest Ruby package from http://rubyinstaller.org/. Here is the output of ruby -v: ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-mingw32]
As far as I know, the Oniguruma regular expression library in 1.9 has full support for Unicode characters.
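For comparison, here is a minimal sketch of the behaviour I would expect. As far as I can tell, Ruby 1.9 treats \w as the ASCII-only class [a-zA-Z0-9_] even on UTF-8 strings, while \p{Word} and [[:word:]] are Unicode-aware. The sample string and the variable names are my own assumptions for illustration, not the string from the snippet above.

```ruby
# coding: utf-8
# Hypothetical sample: "asdf" followed by six Cyrillic letters, written
# with \u escapes so the example does not depend on the file's encoding.
cyrillic = "\u043F\u0440\u0438\u0432\u0435\u0442"
sample   = "asdf" + cyrillic

# \w only matches ASCII word characters, so the Cyrillic part survives.
ascii_only = sample.gsub(/\w/, '')

# \p{Word} and the POSIX bracket [[:word:]] also match Cyrillic letters,
# so nothing is left after the substitution.
unicode_aware = sample.gsub(/\p{Word}/, '')
posix_class   = sample.gsub(/[[:word:]]/, '')

p ascii_only == cyrillic   # Cyrillic letters were not removed by \w
p unicode_aware.empty?     # everything was removed by \p{Word}
p posix_class.empty?       # everything was removed by [[:word:]]
```

If that is right, replacing \w with \p{Word} (or \p{L} for letters only) in the gsub! call would remove the Cyrillic characters as well.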
© Stack Overflow or respective owner