Create unique identifier for different row-groups

Posted by Max van der Heijden on Stack Overflow See other posts from Stack Overflow or by Max van der Heijden
Published on 2012-11-05T10:58:23Z Indexed on 2012/11/05 11:00 UTC
Read the original article Hit count: 225

Filed under:

sequences

I want to number certain combinations of row in a dataframe (which is ordered on ID and on Time)

tc <- textConnection('
id              time       end_yn   number
abc             10         0        1
abc             11         0        2
abc             12         1        3 
abc             13         0        1 
def             10         0        1
def             15         1        2
def             16         0        1
def             17         0        2
def             18         1        3
')

test <- read.table(tc, header=TRUE)

The goal is to create a new column ("journey_nr") that give a unique number to each row based on the journey it belongs to. Journeys are defined as a sequence of rows per id up until to end_yn == 1, also if end_ynnever becomes 1, the journey should also be numbered (see the expected outcome example). It is only possible to have end_yn == 0 journeys at the end of a collection of rows for an ID (as shown at row 4 for id 3). So either no end_yn == 1 has occured for that ID or that happened before the end_yn == 0-journey (see id == abc in the example).

I know how to number using the data.table package, but I do not know which columns to combine in order to get the expected outcome. I've searched the data.table-tag on SO, but could not find a similar problem.

Expected outcome:

id              time       end_yn   number    journey
abc             10         0        1         1
abc             11         0        2         1
abc             12         1        3         1
abc             13         0        1         2
def             10         0        1         3
def             15         1        2         3
def             16         0        1         4
def             17         0        2         4
def             18         1        3         4

Developer IT

Create unique identifier for different row-groups - Developer IT

Create unique identifier for different row-groups

r

data.table

sequences

Related posts about r

Related posts about data.table

How to obtain a random sub-datatable from another data table

Using list() to extract a data.table inside of a function

replacing data.frame element-wise operations with data.table (that used rowname)

Join and sum not compatible matrices through data.table

C# and NpgsqlDataAdapter returning a single string instead of a data table

Categories cloud