Recognize Dates In A String
- by Tim Scott
I want a class that implements an interface something like this:
public interface IDateRecognizer
{
    DateTime[] Recognize(string s);
}
The dates might appear anywhere in the string and might be in any format. For now, I could limit this to U.S. culture formats. The dates would not be delimited in any way, and they might have arbitrary amounts of whitespace between their parts. The ideas I have are:
ANTLR
Regex
Hand rolled
I have never used ANTLR, so I would be learning from scratch. I wonder if there are libraries or code samples out there that do something similar and could give me a jump start. Is ANTLR too heavy for such a narrow use?
I have used Regex a lot before, but I hate it for all the reasons that most people hate it.
I could certainly hand-roll it, but I'd rather not re-solve a solved problem.
Suggestions?
UPDATE: Here is an example. Given this input:
This is a date 11/3/63. Here is
another one: November 03, 1963; and
another one Nov 03, 63 and some
more (11/03/1963). The dates could be
in any U.S. format. They might have
dashes like 11-2-1963 or weird extra
whitespace inside like this:
Nov 3, 1963,
and even maybe the comma is missing
like [Nov 3 63] but that's an edge
case.
The output should be an array of seven DateTimes. Six of them would be the same: 11/03/1963 00:00:00. The dashed example, 11-2-1963, would be 11/02/1963 00:00:00.
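To make the Regex idea concrete, here is a minimal sketch of how it could look. It is not a finished solution. The class name RegexDateRecognizer, the helper names, and the two-digit-year pivot (30 and up means 19xx) are my own assumptions for illustration. A regex only finds candidates in the two shapes shown in the example (numeric with slashes or dashes, and month name first), and the calendar check is done in plain code. The invariant culture is used for month names because they are the same English names as in en-US.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public interface IDateRecognizer
{
    DateTime[] Recognize(string s);
}

// Hypothetical implementation: a regex proposes candidates, plain code validates them.
public class RegexDateRecognizer : IDateRecognizer
{
    // Alternative 1: numeric, e.g. 11/3/63 or 11-2-1963 (same separator twice).
    // Alternative 2: month name first, e.g. "November 03, 1963" or "Nov 3 63".
    // IgnorePatternWhitespace lets the pattern be laid out readably.
    private static readonly Regex Candidate = new Regex(
        @"\b(?:
              (?<m>[0-9]{1,2}) \s* (?<sep>[/-]) \s* (?<d>[0-9]{1,2}) \s* \k<sep> \s* (?<y>[0-9]{4}|[0-9]{2})
            |
              (?<mon>[A-Za-z]{3,9}) \.? \s+ (?<d>[0-9]{1,2}) \s* ,? \s* (?<y>[0-9]{4}|[0-9]{2})
          )\b",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture);

    public DateTime[] Recognize(string s)
    {
        var found = new List<DateTime>();
        foreach (Match match in Candidate.Matches(s))
        {
            int month = match.Groups["mon"].Success
                ? MonthFromName(match.Groups["mon"].Value)
                : int.Parse(match.Groups["m"].Value, CultureInfo.InvariantCulture);
            int day = int.Parse(match.Groups["d"].Value, CultureInfo.InvariantCulture);
            int year = ExpandYear(match.Groups["y"].Value);

            // Reject candidates that are not real calendar dates (e.g. 13/45/1963).
            if (month >= 1 && month <= 12 && year >= 1 &&
                day >= 1 && day <= DateTime.DaysInMonth(year, month))
            {
                found.Add(new DateTime(year, month, day));
            }
        }
        return found.ToArray();
    }

    // Returns 1..12 for a full or abbreviated English month name, or 0 if unknown.
    private static int MonthFromName(string name)
    {
        DateTimeFormatInfo info = CultureInfo.InvariantCulture.DateTimeFormat;
        string[] full = info.MonthNames;
        string[] abbreviated = info.AbbreviatedMonthNames;
        for (int i = 0; i < 12; i++)
        {
            if (name.Equals(full[i], StringComparison.OrdinalIgnoreCase) ||
                name.Equals(abbreviated[i], StringComparison.OrdinalIgnoreCase))
            {
                return i + 1;
            }
        }
        return 0;
    }

    // Assumed pivot: two-digit years 30..99 mean 19xx, 00..29 mean 20xx.
    private static int ExpandYear(string digits)
    {
        int year = int.Parse(digits, CultureInfo.InvariantCulture);
        if (digits.Length == 4) return year;
        return year >= 30 ? 1900 + year : 2000 + year;
    }
}

public static class Demo
{
    public static void Main()
    {
        IDateRecognizer recognizer = new RegexDateRecognizer();
        string input = "Nov   3,  1963 and 11/3/63 and [Nov 3 63]";
        // Prints 11/03/1963 three times.
        foreach (DateTime date in recognizer.Recognize(input))
            Console.WriteLine(date.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture));
    }
}
```

Even this sketch has obvious gaps. It knows nothing about day-first or year-first orders, and a phrase like "Nov 1963" would be misread as November 19, 1963. Covering more formats means more alternatives in the pattern, which is exactly where Regex starts to hurt.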