How to reduce a data frame keeping the order for other columns
Posted by betabandido on Stack Overflow
Published on 2012-06-11T22:38:27Z
Indexed on 2012/06/11 22:39 UTC
r | data.frame
I am trying to reduce a data frame using the max function on a given column. I would like to preserve the other columns, keeping the values from the same rows in which each maximum value was selected. An example will make this clearer.
Let us assume we have the following data frame:
dframe <- data.frame(BENCH = rep(letters[1:4], each = 4),
                     CFG   = rep(1:4, 4),
                     VALUE = runif(4 * 4))
This gives me:
   BENCH CFG      VALUE
1      a   1 0.98828096
2      a   2 0.19630597
3      a   3 0.83539540
4      a   4 0.90988296
5      b   1 0.01191147
6      b   2 0.35164194
7      b   3 0.55094787
8      b   4 0.20744004
9      c   1 0.49864470
10     c   2 0.77845408
11     c   3 0.25278871
12     c   4 0.23440847
13     d   1 0.29795494
14     d   2 0.91766057
15     d   3 0.68044728
16     d   4 0.18448748
Now, I want to reduce the data in order to select the maximum VALUE for each different BENCH:
aggregate(VALUE ~ BENCH, dframe, FUN=max)
This gives me the expected result:
  BENCH     VALUE
1     a 0.9882810
2     b 0.5509479
3     c 0.7784541
4     d 0.9176606
Next, I tried to preserve other columns:
aggregate(cbind(VALUE, CFG) ~ BENCH, dframe, FUN=max)
This reduction returns:
  BENCH     VALUE CFG
1     a 0.9882810   4
2     b 0.5509479   4
3     c 0.7784541   4
4     d 0.9176606   4
Both VALUE and CFG are reduced independently using the max function, so every CFG comes out as 4 regardless of which row held the maximum VALUE. This is not what I want. For instance, in this example I would like to obtain:
  BENCH     VALUE CFG
1     a 0.9882810   1
2     b 0.5509479   3
3     c 0.7784541   2
4     d 0.9176606   2
Here CFG is not reduced; it simply keeps the value associated with the maximum VALUE for each BENCH.
How could I change my reduction in order to obtain the last result shown?
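For reference, here is a minimal sketch of one way the desired result might be obtained in base R. It is only an illustration, not necessarily the best approach: it splits the data frame by BENCH and keeps the whole row with the largest VALUE in each group. To make it reproducible, the data frame is rebuilt from the values printed above instead of calling runif(); the names `rows` and `reduced` are made up for this example.

```r
# Fixed copy of the example data (values copied from the printed data frame),
# so the result does not depend on runif().
dframe <- data.frame(
  BENCH = rep(letters[1:4], each = 4),
  CFG   = rep(1:4, times = 4),
  VALUE = c(0.98828096, 0.19630597, 0.83539540, 0.90988296,
            0.01191147, 0.35164194, 0.55094787, 0.20744004,
            0.49864470, 0.77845408, 0.25278871, 0.23440847,
            0.29795494, 0.91766057, 0.68044728, 0.18448748)
)

# For each BENCH group, keep the entire row holding the largest VALUE.
# which.max() returns the position of the first maximum, so ties
# resolve to the earliest row within the group.
rows <- lapply(split(dframe, dframe$BENCH),
               function(g) g[which.max(g$VALUE), ])

# Stack the per-group rows back into a single data frame.
reduced <- do.call(rbind, rows)
rownames(reduced) <- NULL

# Reorder the columns to match the layout shown in the question.
reduced <- reduced[, c("BENCH", "VALUE", "CFG")]
print(reduced)
```

With the data above, CFG comes out as 1, 3, 2, 2 for benches a to d, which matches the table I am after. Because whole rows are selected, any additional columns would be carried along unchanged. Note that only one row per group is kept when the maximum is tied.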
© Stack Overflow or respective owner