Separating code logic from the actual data structures. Best practices?
- by Patrick
I have an application that loads lots of data into memory (it needs to perform mathematical simulations on big data sets). This data comes from several database tables that all refer to each other.
The consistency rules on the data are rather complex, and looking up all the relevant records requires quite a few hash maps and other auxiliary data structures built on top of the data.
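To make that concrete, here is a minimal sketch of what I mean (the record types and index names are made up for illustration): the data set keeps hash-based indexes next to the primary records, and the consistency checks rely on those indexes for fast lookups.

```java
import java.util.*;

// Hypothetical in-memory data set with secondary hash indexes.
class DataSet {
    // primary records, keyed by id
    final Map<Long, Order> ordersById = new HashMap<>();
    final Map<Long, Customer> customersById = new HashMap<>();

    // secondary index: orders grouped by customer, maintained alongside the data
    final Map<Long, List<Order>> ordersByCustomer = new HashMap<>();

    void addOrder(Order o) {
        ordersById.put(o.id, o);
        ordersByCustomer.computeIfAbsent(o.customerId, k -> new ArrayList<>()).add(o);
    }

    // example consistency rule that relies on the indexes:
    // every order must reference an existing customer
    boolean ordersReferenceValidCustomers() {
        return ordersByCustomer.keySet().stream().allMatch(customersById::containsKey);
    }

    // runs every consistency rule against the full data set
    boolean isConsistent() {
        return ordersReferenceValidCustomers(); // ...plus the other rules
    }
}

class Order { long id; long customerId; }
class Customer { long id; }
```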
The problem is that this data may also be changed interactively by the user in a dialog. When the user presses the OK button, I want to run all the checks to verify that they didn't introduce inconsistencies into the data. In practice all the data needs to be checked at once, so I cannot update my data set incrementally and perform the checks one by one.
However, all the checking code works on the actual data set loaded in memory and uses the hash maps and other data structures. This means I have to do the following (sketched in code after the list):
1. Take the user's changes from the dialog
2. Apply them to the big data set
3. Perform the checks on the big data set
4. Undo all the changes if the checks fail
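In code, the current approach looks roughly like this (assuming the DataSet sketch above plus a hypothetical Change abstraction that knows how to apply and undo itself; none of these names are real API):

```java
import java.util.*;

// Hypothetical abstraction: each edit from the dialog can apply and undo itself.
interface Change {
    void applyTo(DataSet data);
    void undoOn(DataSet data);
}

class ChangeApplier {
    // Apply all changes, run the full consistency check, roll back on failure.
    boolean applyUserChanges(DataSet data, List<Change> changes) {
        List<Change> applied = new ArrayList<>();
        for (Change c : changes) {
            c.applyTo(data);                   // mutates the shared data set and its indexes
            applied.add(c);
        }
        if (!data.isConsistent()) {            // all checks run against the full data set
            for (int i = applied.size() - 1; i >= 0; i--) {
                applied.get(i).undoOn(data);   // roll back in reverse order
            }
            return false;                      // reject the dialog's edits
        }
        return true;
    }
}
```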
I don't like this solution, since other threads are continuously using the data set and I don't want to halt them while performing the checks. Also, undoing means the old state has to be kept aside somewhere, which isn't feasible either.
An alternative is to separate the checking code from the data set and let it work on explicitly supplied data (e.g. the values coming from the dialog), but then the checks cannot use the hash maps and other auxiliary structures, because those only exist on the big data set, making the checks much slower.
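Roughly, that alternative would look like this (reusing the Order and Customer types from the first sketch); without the indexes the same rule degrades into nested scans:

```java
import java.util.*;

// Sketch of a checker that takes the candidate records explicitly instead of
// reading the shared data set. No secondary indexes are available here.
class StandaloneChecker {
    boolean ordersReferenceValidCustomers(Collection<Order> orders,
                                          Collection<Customer> customers) {
        for (Order o : orders) {
            boolean found = false;
            for (Customer c : customers) {     // linear search, no customersById index
                if (c.id == o.customerId) { found = true; break; }
            }
            if (!found) return false;          // O(n*m) instead of hash lookups
        }
        return true;
    }
}
```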
What is a good practice for checking a user's changes on complex data before applying them to the application's data set?