Design suggestion for expression tree evaluation with time-series data
- by Lirik
I have a genetic program, written in C#, that uses financial time-series data. It currently works, but I want to redesign the architecture to make it more robust. My main goals are:
Present the time-series data to the expression trees sequentially.
Allow expression trees to access previous data rows when needed.
Optimize data-access performance while evaluating the expression trees.
Keep a common interface so that various types of data can be used.
Here are the possible approaches I've thought about:
1. Evaluate the expression tree by passing a data row into the root node and letting each child node use the same data row.
2. Evaluate the expression tree by passing in the data row index and letting each node get the data row from a shared DataSet. (Currently I pass the row index and look the data up in multiple synchronized arrays.)
3. Hybrid: an immutable data set is accessible to all of the expression trees, and each expression tree is evaluated by passing in a data row.
The benefit of the first approach is that the data row is passed directly into the expression tree, so no further queries are made against the data set (which should improve performance in a multithreaded environment). The drawback is that the expression tree has no access to the rest of the data, which is a problem when a function needs to do calculations using previous data rows.
The benefit of the second approach is that the expression trees can access any data up to the latest data row. The drawback is that unless I tell the tree which row is the latest one, it has to iterate through the rows to find it.
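To make the second approach concrete, here is a minimal sketch of what I have in mind. All names (`INode`, `ColumnNode`, `SubtractNode`) are hypothetical, and I'm assuming every column is a `double` and that the shared data is just a read-only list of rows rather than a real DataSet. The index passed to `Evaluate` is the latest visible row, so a node can look back with a lag but never forward, and nothing has to search for the last row.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node interface: shared rows plus the index of the latest visible row.
public interface INode
{
    double Evaluate(IReadOnlyList<double[]> rows, int index);
}

// Terminal node: reads one column, optionally lagged by a number of rows.
public sealed class ColumnNode : INode
{
    private readonly int _column;
    private readonly int _lag;

    public ColumnNode(int column, int lag = 0)
    {
        // A negative lag would read future rows, so reject it.
        if (lag < 0) throw new ArgumentOutOfRangeException(nameof(lag));
        _column = column;
        _lag = lag;
    }

    public double Evaluate(IReadOnlyList<double[]> rows, int index)
    {
        // Clamp so that a lag never reaches before the first row.
        int i = Math.Max(0, index - _lag);
        return rows[i][_column];
    }
}

// Function node: passes the same rows and index down to both children.
public sealed class SubtractNode : INode
{
    private readonly INode _left;
    private readonly INode _right;

    public SubtractNode(INode left, INode right)
    {
        _left = left;
        _right = right;
    }

    public double Evaluate(IReadOnlyList<double[]> rows, int index)
        => _left.Evaluate(rows, index) - _right.Evaluate(rows, index);
}
```

For example, `new SubtractNode(new ColumnNode(0), new ColumnNode(0, 1))` would compute the change in column 0 since the previous row. Clamping at the first row is just one possible policy for missing history; returning a default value or throwing would work too.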
The benefit of the hybrid is that it should generally perform better and still provide access to the earlier data. It supports two basic "views" of data: the latest row and the previous rows.
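Here is a rough sketch of how I picture the hybrid, in case it helps. Every name (`TimeSeries`, `RowView`, `IExpression`, `Momentum`) is made up for illustration, and I'm again assuming all columns are `double`. The series copies its input once and never changes afterwards, and each tree is handed a small `RowView` that exposes the two views: the latest row through the indexer and earlier rows through `Previous`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable data set shared by all expression trees.
public sealed class TimeSeries
{
    private readonly double[][] _rows;

    public TimeSeries(IEnumerable<double[]> rows)
    {
        // Copy defensively so later changes by the caller cannot affect the series.
        var copy = new List<double[]>();
        foreach (var row in rows) copy.Add((double[])row.Clone());
        _rows = copy.ToArray();
    }

    public int Count => _rows.Length;

    public RowView At(int index)
    {
        if (index < 0 || index >= _rows.Length)
            throw new ArgumentOutOfRangeException(nameof(index));
        return new RowView(_rows, index);
    }
}

// Lightweight view: the latest row plus read-only access to earlier rows.
public readonly struct RowView
{
    private readonly double[][] _rows;
    private readonly int _index;

    internal RowView(double[][] rows, int index)
    {
        _rows = rows;
        _index = index;
    }

    // Value of a column in the latest row.
    public double this[int column] => _rows[_index][column];

    // Number of earlier rows available behind the latest one.
    public int History => _index;

    // Value of a column 'lag' rows before the latest row.
    public double Previous(int lag, int column)
    {
        if (lag < 1 || lag > _index)
            throw new ArgumentOutOfRangeException(nameof(lag));
        return _rows[_index - lag][column];
    }
}

// Hypothetical node interface: nodes only ever see a RowView.
public interface IExpression
{
    double Evaluate(RowView row);
}

// Example node: difference between the latest value and the value 'lag' rows ago.
public sealed class Momentum : IExpression
{
    private readonly int _column;
    private readonly int _lag;

    public Momentum(int column, int lag)
    {
        if (lag < 1) throw new ArgumentOutOfRangeException(nameof(lag));
        _column = column;
        _lag = lag;
    }

    // Returns 0 when there is not enough history yet (an arbitrary choice).
    public double Evaluate(RowView row)
        => row.History >= _lag ? row[_column] - row.Previous(_lag, _column) : 0.0;
}
```

The driver loop would then call `tree.Evaluate(series.At(i))` for `i` from `0` to `series.Count - 1`. Because nothing writes to the series after construction, I'd expect several trees to be able to read it from different threads without locking, although I haven't measured this.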
Do you guys know of any design patterns, or do you have any tips, that can help me build this type of system? Should I use a DataSet to hold and present the data, or are there more efficient ways to present rows of data while maintaining a simple interface?
FYI: All of my code is written in C#.