Building the median for each string key in IList<IDictionary<string, double>>
- by Oliver
I have a little brain bender and just can't find the right direction to get it to work:
Given is an IList<IDictionary<string, double>> that is filled as follows:
Name|Value
----+-----
"x" | 3.8
"y" | 4.2
"z" | 1.5
----+-----
"x" | 7.2
"y" | 2.9
"z" | 1.3
----+-----
... | ...
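For illustration, the first two rows of the table could be written out by hand roughly like this. It is only a minimal sketch: the class name SampleData and the method Create are made up for this example and are not part of my real code.

```csharp
using System.Collections.Generic;

// Hypothetical helper, only for illustration.
public static class SampleData
{
    // Builds a list holding the two rows shown in the table above.
    // Each dictionary is one row; each key is one variable name.
    public static IList<IDictionary<string, double>> Create()
    {
        return new List<IDictionary<string, double>>
        {
            new Dictionary<string, double> { { "x", 3.8 }, { "y", 4.2 }, { "z", 1.5 } },
            new Dictionary<string, double> { { "x", 7.2 }, { "y", 2.9 }, { "z", 1.3 } },
        };
    }
}
```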
To fill this with some random data I used the following method:
var list = CreateRandomPoints(new string[] { "x", "y", "z" }, 20);
It works as follows:
    //Random number source used by CreateRandomLine
    private static readonly Random _Rand = new Random();

    private IList<IDictionary<string, double>> CreateRandomPoints(string[] variableNames, int count)
    {
        var list = new List<IDictionary<string, double>>(count);
        //Take only the first 'count' rows of the endless sequence below
        list.AddRange(CreateRandomPoints(variableNames).Take(count));
        return list;
    }

    private IEnumerable<IDictionary<string, double>> CreateRandomPoints(string[] variableNames)
    {
        //Endless, lazily evaluated sequence of random rows
        while (true)
            yield return CreateRandomLine(variableNames);
    }

    private IDictionary<string, double> CreateRandomLine(string[] variableNames)
    {
        var dict = new Dictionary<string, double>(variableNames.Length);
        foreach (var variable in variableNames)
        {
            //Random value in the range [0, 10)
            dict.Add(variable, _Rand.NextDouble() * 10);
        }
        return dict;
    }
It is already ensured that every dictionary within a list contains the same keys (but from list to list the names and the number of keys can change).
So that's what I've got. Now to what I need:
I'd like to get the median (or any other mathematical aggregate) of each key across all the dictionaries, so that the function to call would look something like:
IDictionary<string, double> GetMedianOfRows(this IList<IDictionary<string, double>> list)
Best would be to pass some kind of aggregate operation as a parameter to the function to make it more generic (I don't know if the Func has the correct parameters, but it should give an idea of what I'd like to do):
public static IDictionary<string, double> Aggregate(this IList<IDictionary<string, double>> list, Func<IEnumerable<double>, double> aggregator)
My biggest problem right now is doing the job with a single iteration over the list: if the list holds 20 variables with 1000 values each, I don't want to iterate over the list 20 times. Instead I would like to go over the list once and compute all twenty variables at the same time. (The biggest advantage of doing it that way is that it could later also be used as IEnumerable<T> on any part of the list.)
So here is the code I've got so far:
    public static IDictionary<string, double> GetMedianOfRows(this IList<IDictionary<string, double>> list)
    {
        //Check of parameters is left out!

        //Get first item for initialization of result dictionary
        var firstItem = list[0];

        //Create result dictionary and fill in variable names
        var dict = new Dictionary<string, double>(firstItem.Count);

        //Iterate over the whole list
        foreach (IDictionary<string, double> row in list)
        {
            //Iterate over each key/value pair within the list
            foreach (var kvp in row)
            {
                //How to determine median of all values?
            }
        }

        return dict;
    }
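One direction that might work, as a rough sketch rather than a finished solution: walk over the rows once, collect the values of each key into its own list, and only then apply the aggregator to every collected list. The median needs all values of a key before it can be computed, so the values have to be buffered, but the source list is still enumerated only once. The names DictionaryListExtensions, AggregateByKey and Median are assumptions made up for this sketch. The method is called AggregateByKey instead of Aggregate so it cannot be confused with LINQ's own Aggregate, and it takes IEnumerable instead of IList so it would also work on a part of the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class and method names, chosen only for this sketch.
public static class DictionaryListExtensions
{
    // Enumerates the rows once, buffers the values of each key,
    // then applies the aggregator to each buffered column.
    public static IDictionary<string, double> AggregateByKey(
        this IEnumerable<IDictionary<string, double>> rows,
        Func<IEnumerable<double>, double> aggregator)
    {
        var columns = new Dictionary<string, List<double>>();

        foreach (var row in rows)
        {
            foreach (var kvp in row)
            {
                List<double> values;
                if (!columns.TryGetValue(kvp.Key, out values))
                {
                    // First time this key is seen: start a new column
                    values = new List<double>();
                    columns.Add(kvp.Key, values);
                }
                values.Add(kvp.Value);
            }
        }

        return columns.ToDictionary(c => c.Key, c => aggregator(c.Value));
    }

    // Median: the middle value of the sorted values, or the mean of the
    // two middle values when the count is even.
    public static double Median(IEnumerable<double> values)
    {
        var sorted = values.OrderBy(v => v).ToArray();
        if (sorted.Length == 0)
            throw new InvalidOperationException("Sequence contains no elements.");

        int mid = sorted.Length / 2;
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

With this in place, a call could look like list.AggregateByKey(DictionaryListExtensions.Median), and any other aggregate could be passed the same way, for example list.AggregateByKey(values => values.Average()).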
Just to be sure, here is a short explanation of the median.