How to reduce a data frame keeping the order for other columns
- by betabandido
I am trying to reduce a data frame using the max function on a given column. I would like to preserve other columns but keeping the values from the same rows where each maximum value was selected. An example will make this explanation easier.
Let us assume we have the following data frame:
dframe <- data.frame(list(BENCH=sort(rep(letters[1:4], 4)),
CFG=rep(1:4, 4),
VALUE=runif(4 * 4)
))
This gives me:
BENCH CFG VALUE
1 a 1 0.98828096
2 a 2 0.19630597
3 a 3 0.83539540
4 a 4 0.90988296
5 b 1 0.01191147
6 b 2 0.35164194
7 b 3 0.55094787
8 b 4 0.20744004
9 c 1 0.49864470
10 c 2 0.77845408
11 c 3 0.25278871
12 c 4 0.23440847
13 d 1 0.29795494
14 d 2 0.91766057
15 d 3 0.68044728
16 d 4 0.18448748
Now, I want to reduce the data in order to select the maximum VALUE for each different BENCH:
aggregate(VALUE ~ BENCH, dframe, FUN=max)
This gives me the expected result:
BENCH VALUE
1 a 0.9882810
2 b 0.5509479
3 c 0.7784541
4 d 0.9176606
Next, I tried to preserve other columns:
aggregate(cbind(VALUE, CFG) ~ BENCH, dframe, FUN=max)
This reduction returns:
BENCH VALUE CFG
1 a 0.9882810 4
2 b 0.5509479 4
3 c 0.7784541 4
4 d 0.9176606 4
Both VALUE and CFG are reduced using max function. But this is not what I want. For instance, in this example I would like to obtain:
BENCH VALUE CFG
1 a 0.9882810 1
2 b 0.5509479 3
3 c 0.7784541 2
4 d 0.9176606 2
where CFG is not reduced, but it just keeps the value associated to the maximum VALUE for each different BENCH.
How could I change my reduction in order to obtain the last result shown?