Problem: I have an address field from an Access database which has been converted to Sql Server 2005. This field has everything all in one field. I need to parse out the individual sections of the address into their appropriate fields in a normalized table. I need to do this for approximately 4,000 records and it needs to be repeatable.
Here are the rules for this exercise:
1 - no whining about how this should have been separate fields in the first place, we are often confronted with less than ideal situations and have to make the best of them
2- for this post, use any language you want
3- feel free to play code golf
4 - Assume an address in the US (for now)
5 - assume that the input string will sometimes contain an addressee (the person being addressed) and/or a second street address (i.e. Suite B)
6 - states may be abbreviated
7 - zip code could be standard 5 digit or zip+4
8 - there are typos in some instances
UPDATE: In response to the questions posed, standards were not universally followed, I need need to store the individual values, not just geocode and errors means typo (corrected above)
Sample Data:
A. P. Croll & Son 2299 Lewes-Georgetown Hwy, Georgetown, DE 19947
11522 Shawnee Road, Greenwood DE 19950
144 Kings Highway, S.W. Dover, DE 19901
Intergrated Const. Services 2 Penns Way Suite 405 New Castle, DE 19720
Humes Realty 33 Bridle Ridge Court, Lewes, DE 19958
Nichols Excavation 2742 Pulaski Hwy Newark, DE 19711
2284 Bryn Zion Road, Smyrna, DE 19904
VEI Dover Crossroads, LLC 1500 Serpentine Road, Suite 100 Baltimore MD 21
580 North Dupont Highway Dover, DE 19901
P.O. Box 778 Dover, DE 19903