Counting in R data.table
Posted by Simon Z. on Stack Overflow
Published on 2013-11-08T21:50:38Z
Indexed on 2013/11/08 21:53 UTC
r | data.table
I have the following data.table:
library(data.table)
set.seed(1)
DT <- data.table(VAL = sample(c(1, 2, 3), 10, replace = TRUE))
    VAL
 1:   1
 2:   2
 3:   2
 4:   3
 5:   1
 6:   3
 7:   3
 8:   2
 9:   2
10:   1
Now I want to perform two tasks:
- Count the occurrences of each number in VAL.
- Number the rows within each group of equal VAL (first, second, third occurrence).
At the end I want the result:
    VAL COUNT IDX
 1:   1     3   1
 2:   2     4   1
 3:   2     4   2
 4:   3     3   1
 5:   1     3   2
 6:   3     3   2
 7:   3     3   3
 8:   2     4   3
 9:   2     4   4
10:   1     3   3
where COUNT corresponds to task 1 and IDX to task 2.
I tried to work with which and length using .I:
DT[, list(COUNT = length(VAL == VAL[.I]),
          IDX = which(which(VAL == VAL[.I]) == .I))]
but this does not work, because .I refers to the whole vector of row indices, so I guess one must use .I[]. Inside .I[], though, I face the same problem again: I do not have the index of the current row. I also know (from reading the data.table FAQ and following the posts here) that looping through rows should be avoided if possible.
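A small sketch of why the attempt above cannot work as written. It assumes the values are typed out by hand to match the table shown in the question, since sample() with set.seed(1) may return a different sequence on newer R versions. The names i_seen and n_compared are only illustrative.

```r
library(data.table)

# Assumption: values copied from the printed table, not regenerated with sample()
DT <- data.table(VAL = c(1, 2, 2, 3, 1, 3, 3, 2, 2, 1))

# Without `by`, j is evaluated once for the whole table,
# so .I is the full vector of row numbers 1..10, not a single row index
i_seen <- DT[, .I]

# VAL[.I] is therefore just VAL again; the comparison is a length-10
# logical vector, and length() reports 10 regardless of the values
n_compared <- DT[, length(VAL == VAL[.I])]
```

So the expression never looks at one row at a time; any per-value counting has to come from grouping rather than from indexing with .I.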
So, what's the data.table way?
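One possible approach, offered as a sketch rather than the definitive answer: group by VAL and use the special symbol .N, which holds the number of rows in the current group. The values are again typed out by hand (an assumption, to avoid depending on the random number generator), and the column names COUNT and IDX are taken from the desired result in the question.

```r
library(data.table)

# Assumption: values copied from the printed table in the question
DT <- data.table(VAL = c(1, 2, 2, 3, 1, 3, 3, 2, 2, 1))

# For each group of equal VAL:
#   .N           is the group size           -> COUNT (task 1)
#   seq_len(.N)  numbers the rows 1, 2, ...  -> IDX   (task 2)
# := adds the columns by reference and keeps the original row order
DT[, c("COUNT", "IDX") := list(.N, seq_len(.N)), by = VAL]

print(DT)
```

Because := assigns in place, no explicit loop over rows is needed and the rows stay in their original order, which is what the desired output requires.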