How do you match only valid roman numerals with a regular expression?
- by Daniel Magliola
Thinking about my other problem, i decided I can't even create a regular expression that will match roman numerals (let alone a context-free grammar that will generate them)
The problem is matching only valid roman numerals.
Eg, 990 is NOT "XM", it's "CMXC"
My problem in making the regex for this is that in order to allow or not allow certain characters, I need to look back.
Let's take thousands and hundreds, for example.
I can allow M{0,2}C?M (to allow for 900, 1000, 1900, 2000, 2900 and 3000). However, If the match is on CM, I can't allow following characters to be C or D (because I'm already at 900).
How can I express this in a regex?
If it's simply not expressible in a regex, is it expressible in a context-free grammar?
Thanks for any pointers!