How to reduce a data frame keeping the order for other columns

Posted by betabandido on Stack Overflow
Published on 2012-06-11T22:38:27Z

I am trying to reduce a data frame by applying the max function to a given column within each group. I would like to preserve the other columns, keeping the values from the same rows in which each maximum was found. An example will make this easier to explain.

Let us assume we have the following data frame:

dframe <- data.frame(list(BENCH=sort(rep(letters[1:4], 4)),
                          CFG=rep(1:4, 4),
                          VALUE=runif(4 * 4)
                         ))

This gives me:

   BENCH CFG      VALUE
1      a   1 0.98828096
2      a   2 0.19630597
3      a   3 0.83539540
4      a   4 0.90988296
5      b   1 0.01191147
6      b   2 0.35164194
7      b   3 0.55094787
8      b   4 0.20744004
9      c   1 0.49864470
10     c   2 0.77845408
11     c   3 0.25278871
12     c   4 0.23440847
13     d   1 0.29795494
14     d   2 0.91766057
15     d   3 0.68044728
16     d   4 0.18448748

Now, I want to reduce the data in order to select the maximum VALUE for each different BENCH:

aggregate(VALUE ~ BENCH, dframe, FUN=max)

This gives me the expected result:

  BENCH     VALUE
1     a 0.9882810
2     b 0.5509479
3     c 0.7784541
4     d 0.9176606

Next, I tried to preserve other columns:

aggregate(cbind(VALUE, CFG) ~ BENCH, dframe, FUN=max)

This reduction returns:

  BENCH     VALUE CFG
1     a 0.9882810   4
2     b 0.5509479   4
3     c 0.7784541   4
4     d 0.9176606   4

Both VALUE and CFG are reduced independently using the max function, so the CFG shown is simply the largest CFG in each group. This is not what I want. In this example, I would like to obtain:

  BENCH     VALUE CFG
1     a 0.9882810   1
2     b 0.5509479   3
3     c 0.7784541   2
4     d 0.9176606   2

where CFG is not reduced; it just keeps the value from the row that holds the maximum VALUE for each different BENCH.

How could I change my reduction in order to obtain the last result shown?
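
One possible approach, offered as a minimal sketch rather than a definitive answer: instead of reducing each column separately, select the whole row that holds the maximum VALUE within each BENCH group. The example below hard-codes the values printed earlier so that it does not depend on runif(); the names pieces, best, result and alt are introduced here for illustration only. Note that which.max returns the first maximum, so ties within a group are an assumption to check against your data.

```r
# Fixed copy of the example data (values copied from the printed data frame),
# so the result is reproducible without runif().
dframe <- data.frame(
  BENCH = rep(letters[1:4], each = 4),
  CFG   = rep(1:4, times = 4),
  VALUE = c(0.98828096, 0.19630597, 0.83539540, 0.90988296,
            0.01191147, 0.35164194, 0.55094787, 0.20744004,
            0.49864470, 0.77845408, 0.25278871, 0.23440847,
            0.29795494, 0.91766057, 0.68044728, 0.18448748)
)

# Split the rows by BENCH, keep the single row with the largest VALUE in
# each group, then bind the selected rows back into one data frame.
pieces <- split(dframe, dframe$BENCH)
best   <- lapply(pieces, function(d) d[which.max(d$VALUE), ])
result <- do.call(rbind, best)
rownames(result) <- NULL

# Reorder the columns to match the layout of the desired output.
result <- result[, c("BENCH", "VALUE", "CFG")]
print(result)

# Alternative sketch: compute the per-group maxima with aggregate() and
# merge them back onto the original rows (joins on BENCH and VALUE).
# Unlike which.max, this keeps every tied row in a group.
alt <- merge(aggregate(VALUE ~ BENCH, dframe, FUN = max), dframe)
```

With the data above, result contains one row per BENCH with CFG values 1, 3, 2 and 2, which matches the desired table. The split/which.max version always yields exactly one row per group, while the merge version relies on exact equality of VALUE and may return several rows when a maximum is tied.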
