building median for each string in IList<IDictionary<string, double>>

Posted by Oliver on Stack Overflow
Published on 2010-04-29T10:28:54Z Indexed on 2010/04/29 10:47 UTC

Actually I have a little brain bender and just can't find the right direction to get it to work:

Given is an IList<IDictionary<string, double>>, filled as follows:

Name|Value
----+-----
"x" | 3.8
"y" | 4.2
"z" | 1.5
----+-----
"x" | 7.2
"y" | 2.9
"z" | 1.3
----+-----
... | ...

To fill this with some random data I used the following methods:

var list = CreateRandomPoints(new string[] { "x", "y", "z" }, 20);

This works as follows:

private IList<IDictionary<string, double>> CreateRandomPoints(string[] variableNames, int count)
{
    var list = new List<IDictionary<string, double>>(count);
    list.AddRange(CreateRandomPoints(variableNames).Take(count));

    return list;
}

private IEnumerable<IDictionary<string, double>> CreateRandomPoints(string[] variableNames)
{
    while (true)
        yield return CreateRandomLine(variableNames);
}

private IDictionary<string, double> CreateRandomLine(string[] variableNames)
{
    var dict = new Dictionary<string, double>(variableNames.Length);

    foreach (var variable in variableNames)
    {
        // _Rand is assumed to be a System.Random field of the containing class (not shown)
        dict.Add(variable, _Rand.NextDouble() * 10);
    }

    return dict;
}

I can also say that it is already ensured that every dictionary within a list contains the same keys (but from list to list the names and the number of keys can change).

So that's what I've got. Now to the things I need:

I'd like to get the median (or any other mathematical aggregate operation) of each key across all the dictionaries, so that the function to call would look something like:

public static IDictionary<string, double> GetMedianOfRows(this IList<IDictionary<string, double>> list)

The best would be to pass some kind of aggregate operation as a parameter to the function to make it more generic (I don't know if the Func has the correct parameters, but it should show what I'd like to do):

private static IDictionary<string, double> Aggregate(this IList<IDictionary<string, double>> list, Func<IEnumerable<double>, double> aggregator)

My biggest problem is doing the job with a single iteration over the list: if the list contains 20 variables with 1000 values each, I don't want to iterate over the list 20 times. Instead I would go over the list once and compute all twenty variables at once. (The biggest advantage of doing it that way would be that it could also be used as IEnumerable<T> on any part of the list in a later step.)

So here is the code I've already got:

public static IDictionary<string, double> GetMedianOfRows(this IList<IDictionary<string, double>> list)
{
    //Check of parameters is left out!

    //Get first item for initialization of result dictionary
    var firstItem = list[0];

    //Create result dictionary and fill in variable names
    var dict = new Dictionary<string, double>(firstItem.Count);

    //Iterate over the whole list
    foreach (IDictionary<string, double> row in list)
    {
        //Iterate over each key/value pair within the row
        foreach (var kvp in row)
        {
            //How to determine median of all values?
        }
    }

    return dict;
}
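
One possible way to fill in the stub above, as a minimal sketch rather than a definitive answer: walk the rows once, collect the values of each key into its own list, and only then call the aggregator once per key. The class name DictionaryListExtensions and the method name AggregateByKey are hypothetical (the name was chosen so it is not confused with LINQ's own Aggregate). The sketch assumes that buffering all values in memory is acceptable, which an exact median needs anyway.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryListExtensions
{
    // Hypothetical helper: groups the values of every key in one pass over
    // the rows, then applies the supplied aggregator to each key's values.
    public static IDictionary<string, double> AggregateByKey(
        this IEnumerable<IDictionary<string, double>> rows,
        Func<IEnumerable<double>, double> aggregator)
    {
        if (rows == null) throw new ArgumentNullException("rows");
        if (aggregator == null) throw new ArgumentNullException("aggregator");

        var columns = new Dictionary<string, List<double>>();

        // Single iteration over the rows
        foreach (var row in rows)
        {
            foreach (var kvp in row)
            {
                List<double> values;
                if (!columns.TryGetValue(kvp.Key, out values))
                {
                    // First time this key is seen: start a new value list
                    values = new List<double>();
                    columns.Add(kvp.Key, values);
                }
                values.Add(kvp.Value);
            }
        }

        // One aggregator call per key, over the values collected above
        return columns.ToDictionary(c => c.Key, c => aggregator(c.Value));
    }
}
```

Because the parameter is an IEnumerable rather than an IList, the same call would also work on any part of the list, for example list.Skip(5).Take(10).AggregateByKey(values => values.Average()). Any function that takes an IEnumerable<double> and returns a double can be passed in, including a median function.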

Just to be sure, here is a little explanation of the median.
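
In short, the median is the middle value of the sorted values, or the mean of the two middle values when the count is even. A minimal sketch of that definition follows. The class name MedianHelper and the method name Median are hypothetical, and throwing on an empty sequence is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MedianHelper
{
    // Hypothetical helper: median of a non-empty sequence of doubles.
    public static double Median(IEnumerable<double> values)
    {
        if (values == null) throw new ArgumentNullException("values");

        // Sort a copy so the caller's sequence is left untouched
        var sorted = values.OrderBy(v => v).ToArray();
        if (sorted.Length == 0)
            throw new InvalidOperationException("Sequence contains no elements.");

        int mid = sorted.Length / 2;

        // Odd count: the middle element; even count: mean of the two middle elements
        return sorted.Length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }
}
```

For the two "x" values in the table above (3.8 and 7.2) this gives 5.5. The signature matches Func<IEnumerable<double>, double>, so MedianHelper.Median could be passed as the aggregator parameter described earlier.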

© Stack Overflow or respective owner
