How can I collapse a data frame by some variables, taking the mean across the others?
Posted by Alex Holcombe on Stack Overflow
Published on 2010-04-01T04:49:36Z
I need to summarize a data frame by some variables, ignoring the others. This is sometimes referred to as collapsing. For example, if I have a data frame like this:

    Widget  Type  Energy
    egg     1     20
    egg     2     30
    jap     3     50
    jap     1     60

then collapsing by Widget, with Energy as the dependent variable (Energy ~ Widget) and taking the mean within each group, would yield:

    Widget  Energy
    egg     25
    jap     55
In Excel the closest functionality might be "Pivot tables". I've worked out how to do it in Python (http://alexholcombe.wordpress.com/2009/01/26/summarizing-data-by-combinations-of-variables-with-python/), and here's an example in R that uses the doBy package to do something closely related (http://www.mail-archive.com/[email protected]/msg02643.html). But is there an easy way to do the above in R? Even better, is there anything built into the ggplot2 package to create plots that collapse across some variables?
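To make the collapse described above concrete, here is a minimal sketch in base R. It assumes the example data is held in a data frame named df (a name chosen here purely for illustration) and uses aggregate() with the formula Energy ~ Widget. It is one possible way to express the operation, not necessarily the easiest one.

```r
# Example data from the question; the name `df` is an assumption
df <- data.frame(
  Widget = c("egg", "egg", "jap", "jap"),
  Type   = c(1, 2, 3, 1),
  Energy = c(20, 30, 50, 60)
)

# Collapse by Widget: take the mean of Energy within each Widget,
# ignoring Type because it does not appear in the formula
collapsed <- aggregate(Energy ~ Widget, data = df, FUN = mean)

print(collapsed)
```

Adding more grouping variables to the right-hand side of the formula (for example Energy ~ Widget + Type) would collapse by each combination of them instead.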
© Stack Overflow or respective owner