Regex, encoding, and characters that look alike
Posted
by hack.augusto
on Stack Overflow
Published on 2010-03-24T16:54:31Z
Indexed on
2010/03/25
3:43 UTC
First, a brief example: let's say I have this regex, "/[0-9]{2}°/", and this text, "24º". The text won't match, obviously... (?) Really, it depends on the character encoding.
Here is my problem: I do not have control over which chars the user types, so I need to cover all the possibilities in the regex, as in /[0-9]{2}[°º]/, or, even better, ensure that the text contains only the chars I'm expecting, such as °.
But I can't just remove the unknown chars, otherwise the regex won't work; I need to change each one to the look-alike char that I'm expecting. I have done this with a little function that maps "looks like" to "what I expect" and replaces it. The problem is that I have not covered all the possibilities. For example, today I found a new "-", so now we have three of them, just like LaTeX =D: "-", "--" and "---". Cool, but the regex didn't work.
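For illustration only, here is a rough Python sketch of the kind of mapping function described above. The names `LOOKALIKES`, `canonicalize` and `DEGREES` are hypothetical, and the table is an assumption that would need extending. The one change from a plain lookup table is that any character in the Unicode "dash punctuation" category (Pd) is folded to "-", so a newly encountered dash would probably be handled without a new table entry.

```python
import re
import unicodedata

# Hypothetical table: look-alike character -> the character the regex expects.
# These entries are only examples; the real table depends on the input seen.
LOOKALIKES = {
    "\u00ba": "\u00b0",  # masculine ordinal indicator -> degree sign
    "\u02da": "\u00b0",  # ring above -> degree sign
    "\u2212": "-",       # minus sign (category Sm, not Pd) -> hyphen-minus
}


def canonicalize(text):
    """Replace look-alike characters with the ones the regex expects."""
    out = []
    for ch in text:
        if ch in LOOKALIKES:
            out.append(LOOKALIKES[ch])
        elif unicodedata.category(ch) == "Pd":
            # Pd = dash punctuation: hyphen, en dash, em dash, and similar.
            out.append("-")
        else:
            out.append(ch)
    return "".join(out)


# The original pattern, written with an escape for the degree sign.
DEGREES = re.compile("[0-9]{2}\u00b0")

if __name__ == "__main__":
    # "24" followed by the masculine ordinal indicator, as in the example text.
    print(bool(DEGREES.search(canonicalize("24\u00ba"))))
```

The explicit table is applied on its own rather than relying on Unicode compatibility normalization (NFKC), because NFKC turns "º" into the letter "o" instead of "°". This still does not cover every look-alike; it only narrows the dash case.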
Does anyone know how I might solve this?
© Stack Overflow or respective owner