Create unique identifier for different row-groups

Posted by Max van der Heijden on Stack Overflow See other posts from Stack Overflow or by Max van der Heijden
Published on 2012-11-05T10:58:23Z Indexed on 2012/11/05 11:00 UTC
Read the original article Hit count: 201

Filed under:
|
|

I want to number certain combinations of row in a dataframe (which is ordered on ID and on Time)

tc <- textConnection('
id              time       end_yn   number
abc             10         0        1
abc             11         0        2
abc             12         1        3 
abc             13         0        1 
def             10         0        1
def             15         1        2
def             16         0        1
def             17         0        2
def             18         1        3
')

test <- read.table(tc, header=TRUE)

The goal is to create a new column ("journey_nr") that give a unique number to each row based on the journey it belongs to. Journeys are defined as a sequence of rows per id up until to end_yn == 1, also if end_ynnever becomes 1, the journey should also be numbered (see the expected outcome example). It is only possible to have end_yn == 0 journeys at the end of a collection of rows for an ID (as shown at row 4 for id 3). So either no end_yn == 1 has occured for that ID or that happened before the end_yn == 0-journey (see id == abc in the example).

I know how to number using the data.table package, but I do not know which columns to combine in order to get the expected outcome. I've searched the data.table-tag on SO, but could not find a similar problem.

Expected outcome:

id              time       end_yn   number    journey
abc             10         0        1         1
abc             11         0        2         1
abc             12         1        3         1
abc             13         0        1         2
def             10         0        1         3
def             15         1        2         3
def             16         0        1         4
def             17         0        2         4
def             18         1        3         4

© Stack Overflow or respective owner

Related posts about r

    Related posts about data.table