How do you match only valid roman numerals with a regular expression?
Posted
by Daniel Magliola
on Stack Overflow
See other posts from Stack Overflow
or by Daniel Magliola
Published on 2008-11-06T01:29:25Z
Indexed on
2010/05/12
2:24 UTC
Read the original article
Hit count: 387
Thinking about my other problem, i decided I can't even create a regular expression that will match roman numerals (let alone a context-free grammar that will generate them)
The problem is matching only valid roman numerals. Eg, 990 is NOT "XM", it's "CMXC"
My problem in making the regex for this is that in order to allow or not allow certain characters, I need to look back. Let's take thousands and hundreds, for example.
I can allow M{0,2}C?M (to allow for 900, 1000, 1900, 2000, 2900 and 3000). However, If the match is on CM, I can't allow following characters to be C or D (because I'm already at 900).
How can I express this in a regex?
If it's simply not expressible in a regex, is it expressible in a context-free grammar?
Thanks for any pointers!
© Stack Overflow or respective owner