Select only the first rows for each unique value of a column in R
Posted by dmvianna on Stack Overflow
Published on 2012-11-07T22:45:17Z
Given a data frame like this:
test <- data.frame(id = rep(1:5, 2), string = LETTERS[1:10])
test <- test[order(test$id), ]
rownames(test) <- 1:10
> test
   id string
1   1      A
2   1      F
3   2      B
4   2      G
5   3      C
6   3      H
7   4      D
8   4      I
9   5      E
10  5      J
I want to create a new data frame containing only the first occurrence of each id, along with its string. If sqldf accepted R code within it, the query could look like this:
res <- sqldf("select id, min(rownames(test)), string
              from test
              group by id")
> res
  id string
1  1      A
3  2      B
5  3      C
7  4      D
9  5      E
Is there a solution short of creating a new column like
test$row <- rownames(test)
and running the same sqldf query with min(row)?
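One possible approach that needs neither sqldf nor a helper column is base R's duplicated(). The sketch below rebuilds the example data from the question and keeps only the rows where an id appears for the first time. It assumes the data frame is already in the order in which "first" should be judged, as it is here after order().

```r
# Rebuild the example data from the question.
test <- data.frame(id = rep(1:5, 2), string = LETTERS[1:10])
test <- test[order(test$id), ]
rownames(test) <- 1:10

# duplicated() flags every repeat of an id after its first occurrence,
# so negating it keeps exactly one row per id: the first one.
res <- test[!duplicated(test$id), ]
print(res)
```

The original row names (1, 3, 5, 7, 9) are carried over by the subsetting. If the helper-column route is used instead, note that rownames() returns character strings, so min(row) in SQL would likely compare them as text, where "10" sorts before "9"; storing as.integer(rownames(test)) should avoid that.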