Search Results

Search found 3 results on 1 page for 'tapply'.


  • How can I make the output from tapply() into a data.frame

    - by James Thompson
    I have a data.frame in R that looks like this:

              score    rms  template   aln_id       description
        1  -261.410  4.951 2f22A.pdb  2F22A_1 S_00001_0000002_0
        2  -231.987 21.813 1wb9A.pdb  1WB9A_4 S_00002_0000002_0
        3  -263.722  4.903 2f22A.pdb  2F22A_3 S_00003_0000002_0
        4  -269.681 17.732 1wbbA.pdb  1WBBA_6 S_00004_0000002_0
        5  -258.621 19.098 1rxqA.pdb  1RXQA_3 S_00005_0000002_0
        6  -246.805  6.889 1rxqA.pdb 1RXQA_15 S_00006_0000002_0
        7  -281.300 16.262 1wbdA.pdb 1WBDA_11 S_00007_0000002_0
        8  -271.666  4.193 2f22A.pdb  2F22A_2 S_00008_0000002_0
        9  -277.964 13.066 1wb9A.pdb  1WB9A_5 S_00009_0000002_0
        10 -261.024 17.153 1yy9A.pdb  1YY9A_2 S_00001_0000003_0

    I can calculate summary statistics on the data.frame like this:

        > tapply( d$score, d$template, mean )
        1rxqA.pdb 1wb9A.pdb 1wbbA.pdb 1wbdA.pdb 1yy9A.pdb 2f22A.pdb
        -252.7130 -254.9755 -269.6810 -281.3000 -261.0240 -265.5993

    Is there an easy way to coerce this output back into a data.frame? I'd like it to have these two columns:

        d$template  mean

    I love tapply, but right now I'm cutting and pasting the results from tapply into a text file and hacking it up a bit to get the summary statistics that I want with appropriate names. This feels very wrong, and I'd like to do something better!

    Read the article
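One way to approach the question above, as a minimal sketch rather than a definitive answer: the result of `tapply()` is a named array, so its names and values can be put into a data.frame directly, and `aggregate()` returns a data.frame in the first place. The data below copies the `score` and `template` columns from the question; the result names `res1` and `res2` and the column name `mean` are chosen here for illustration.

```r
# Example data copied from the question's table (score and template columns only)
d <- data.frame(
  score = c(-261.410, -231.987, -263.722, -269.681, -258.621,
            -246.805, -281.300, -271.666, -277.964, -261.024),
  template = c("2f22A.pdb", "1wb9A.pdb", "2f22A.pdb", "1wbbA.pdb", "1rxqA.pdb",
               "1rxqA.pdb", "1wbdA.pdb", "2f22A.pdb", "1wb9A.pdb", "1yy9A.pdb")
)

# Option 1: keep tapply() and rebuild a data.frame from its names and values
m <- tapply(d$score, d$template, mean)
res1 <- data.frame(template = names(m), mean = as.vector(m))

# Option 2: aggregate() returns a data.frame directly; rename the value column
res2 <- aggregate(score ~ template, data = d, FUN = mean)
names(res2)[2] <- "mean"

print(res1)
print(res2)
```

Both options give one row per template. Option 1 is the smaller change if the `tapply()` call already exists; Option 2 avoids the intermediate array.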

  • using subset but old variables still left

    - by user2520852
    I am working with a data set, which is basically daily usage data (let's just say variables X and Y) for different cities (about 150 cities). I have created a subset of the data for specific cities only, choosing just 3 of the 150. When I then use tapply by city, I get means for the 3 cities but also get NA for the other 147 cities that were in the data set. I am using the code below:

        df <- read.csv(...)
        df_sub <- subset(df, df$City == 1 | df$City == 3 | df$City == 19)
        X_Breakdown <- tapply(df_sub$X, df_sub$City, mean, na.rm = TRUE)
        print(X_Breakdown)

        City 1 City 2
            15     NA
        City 3 City 4
            12     NA
        City 5 City 6
            NA     NA

    Hope you get the idea. I would like to get a data set that only contains the 3 cities that I'm interested in. It seems that the set of variables is still encoded in R; is there a way to fix this? Kindly advise. Thanks

    Read the article
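A likely explanation for the question above, assuming `City` is stored as a factor: `subset()` keeps all of a factor's levels even when no rows use them, and `tapply()` reports every level, so empty ones show up as NA. A minimal sketch of the effect and of removing the unused levels with `droplevels()` follows. The six-city data frame is a hypothetical stand-in for the CSV, and `with_unused` and `without_unused` are names chosen here.

```r
# Hypothetical stand-in for the CSV: City is a factor with six levels
df <- data.frame(
  City = factor(c(1, 1, 2, 3, 3, 4, 5, 6)),
  X    = c(14, 16, 20, 11, 13, 9, 7, 5)
)

# Keep only two cities; the factor still remembers all six levels
df_sub <- subset(df, City %in% c("1", "3"))

# Unused levels appear as NA in the tapply() result
with_unused <- tapply(df_sub$X, df_sub$City, mean, na.rm = TRUE)

# Drop the unused levels, then summarise again
df_sub$City <- droplevels(df_sub$City)
without_unused <- tapply(df_sub$X, df_sub$City, mean, na.rm = TRUE)

print(with_unused)
print(without_unused)
```

Calling `droplevels(df_sub)` on the whole data frame would do the same for every factor column at once.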

  • Avoid the use of loops (for) with R

    - by albergali
    Hi, I'm working with R and I have code like this:

        i <- 1
        j <- 1
        for (i in 1:10)
          for (j in 1:100)
            if (data[i] == paths[j, 1])
              cluster[i, 4] <- paths[j, 2]

    where:

        data is a vector with 100 rows and 1 column
        paths is a matrix with 100 rows and 5 columns
        cluster is a matrix with 100 rows and 5 columns

    My question is: how could I avoid using "for" loops to iterate through the matrix? I don't know whether the apply functions (lapply, tapply...) are useful in this case. This is a problem when j=10000, for example, because the execution time is very long. Thank you

    Read the article
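The nested loop in the question above is a lookup: for each element of `data`, find the row of `paths` whose first column equals it and copy that row's second column. A minimal sketch using `match()` instead of loops follows, assuming the keys in `paths[, 1]` are unique. With duplicate keys the loop keeps the last matching row while `match()` returns the first, so the results could differ. The sample inputs are hypothetical, shaped as the question describes, and the sketch processes every element of `data` rather than only the first 10.

```r
# Hypothetical inputs with the shapes described in the question
set.seed(1)
paths   <- cbind(1:100, 1001:1100, 0, 0, 0)    # column 1: key, column 2: value to copy
data    <- sample(1:120, 100, replace = TRUE)  # some values have no match in paths
cluster <- matrix(NA_real_, nrow = 100, ncol = 5)

# For each element of data, the row of paths whose key matches (NA if none)
idx <- match(data, paths[, 1])

# Copy column 2 of the matching rows into column 4 of cluster,
# leaving rows without a match untouched
hit <- !is.na(idx)
cluster[hit, 4] <- paths[idx[hit], 2]

head(cluster[, 4])
```

`match()` performs the whole lookup in one vectorised call, so the work no longer grows as the product of the two lengths the way the nested loop does.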
