Search Results

Search found 1 result on 1 page for 'user3640982'.


  • Sampling Duplicates

    - by user3640982
    I have a dataset from which I need to sample. It is set up with an ID field and a year field. I want every record from the most current year, and then I want the most current IDs, but sampled from every 3rd year going back. The data is ordered by year. For example:

        ID<-rep(1:3, 5)
        Year<-rep(c(1,2,3,4,5),each=3)
        df<-data.frame(ID,Year)

           ID Year
        1   1    1
        2   2    1
        3   3    1
        4   1    2
        5   2    2
        6   3    2
        7   1    3
        8   2    3
        9   3    3
        10  1    4
        11  2    4
        12  3    4
        13  1    5
        14  2    5
        15  3    5

    So from this example, I would want to return:

          ID Year
        1  1    1
        2  2    1
        3  3    1
        4  1    4
        5  2    4
        6  3    4

    I'm thinking that some combination of duplicated() and which() should get what I want, but the problem is that duplicated() only tells whether a value has been repeated; it doesn't say which record is being repeated.

        which(duplicated(df$ID))
        [1]  4  5  6  7  8  9 10 11 12 13 14 15

    This is a problem since not every ID exists in every year. Any help would be appreciated.

    Thanks,
    Eric
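
One possible way to get the result described in the post is to select rows by the value of Year rather than by row position, which sidesteps the issue with duplicated() when an ID is missing from some years. The sketch below is not from the original thread. It assumes that Year 1 is the most current year (as the expected output suggests) and that "every 3rd year" means Years 1, 4, 7, and so on; the names keep_years and result are illustrative.

```r
# Example data from the post: 3 IDs observed in each of 5 years
ID <- rep(1:3, 5)
Year <- rep(c(1, 2, 3, 4, 5), each = 3)
df <- data.frame(ID, Year)

# Assumption: the smallest Year value is the "most current" year,
# and every 3rd year counted from it should be kept.
keep_years <- seq(min(df$Year), max(df$Year), by = 3)

# Select on the Year value itself, not on row positions, so the
# result does not depend on every ID being present in every year.
result <- df[df$Year %in% keep_years, ]

# Renumber the rows so they run 1..n like the output in the post
rownames(result) <- NULL
result
```

On the example data this keeps the six rows for Years 1 and 4, which is the output the post asks for. If the most current year were instead the largest Year value, seq(max(df$Year), min(df$Year), by = -3) would be the analogous choice.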
