Data denormalization and C# objects DB serialization
- by Robert Koritnik
I'm using a DB table with various different entities. This means that I can't have an arbitrary number of fields in it to save all kinds of different entities. I want instead save just the most important fields (dates, reference IDs - kind of foreign key to various other tables, most important text fields etc.) and an additional text field where I'd like to store more complete object data.
the most obvious solution would be to use XML strings and store those. The second most obvious choice would be JSON, that usually shorter and probably also faster to serialize/deserialize... And is probably also faster. But is it really? My objects also wouldn't need to be strictly serializable, because JsonSerializer is usually able to serialize anything. Even anonymous objects, that may as well be used here.
What would be the most optimal solution to solve this problem?
Additional info
My DB is highly normalised and I'm using Entity Framework, but for the purpose of having external super-fast fulltext search functionality I'm sacrificing a bit DB denormalisation. Just for the info I'm using SphinxSE on top of MySql. Sphinx would return row IDs that I would use to fast query my index optimised conglomerate table to get most important data from it much much faster than querying multiple tables all over my DB.
My table would have columns like:
RowID (auto increment)
EntityID (of the actual entity - but not directly related because this would have to point to different tables)
EntityType (so I would be able to get the actual entity if needed)
DateAdded (record timestamp when it's been added into this table)
Title
Metadata (serialized data related to particular entity type)
This table would be indexed with SPHINX indexer. When I would search for data using this indexer I would provide a series of EntityIDs and a limit date. Indexer would have to return a very limited paged amount of RowIDs ordered by DateAdded (descending). I would then just join these RowIDs to my table and get relevant results. So this won't actually be full text search but a filtering search. Getting RowIDs would be very fast this way and getting results back from the table would be much faster than comparing EntityIDs and DateAdded comparisons even though they would be properly indexed.