I am working on an application that requires the storage of location information such as city, state, zip code, latitude, and longitude. I would like to ensure:
Location data is accurate
Detroit, CA
Detroit IS NOT in California
Detroit, MI
Detroit IS in Michigan
Cities and states are spelled correctly
California not Calefornia
Detroit not Detriot
Cities and states are named consistently
Valid:
CA
Detroit
Invalid:
Cali
california
DET
d-town
The D
Also, since city/zip data is not guaranteed to be static, updating this data in a normalized fashion could be difficult, whereas it could be implemented as a de facto location if it is denormalized.
A couple thoughts that come to mind:
A collection of reference tables that store a list of all states and the most common cities and zip codes that can grow over time. It would search the database for an exact or similar match and recommend corrections.
Use some sort of service to validate the location data before it is stored in the database.
Is it possible to fulfill these requirements without normalization, and if so, should I denormalize this data?