I want to filter out all rows where var3 < 5 while keeping at least one occurrence of each value of var1.
> foo <- data.frame(var1= c(1, 1, 2, 3, 3, 4, 4, 5), var2=c(9, 5, 13, 9, 12, 11, 13, 9), var3=c(6, 8, 3, 6, 4, 7, 2, 9))
> foo
  var1 var2 var3
1    1    9    6
2    1    5    8
3    2   13    3
4    3    9    6
5    3   12    4
6    4   11    7
7    4   13    2
8    5    9    9
subset(foo, (foo$var3>=5)) would remove rows 3, 5 and 7, and I would lose var1==2.
I want to remove the row if there is another row with the same value of var1 that fulfills the condition foo$var3 >= 5. See row 5.
I want to keep the row, assigning NA to var2 and var3, if no occurrence of that value of var1 fulfills the condition foo$var3 >= 5. See row 3.
This is the result I expect:
  var1 var2 var3
1    1    9    6
2    1    5    8
3    2   NA   NA
4    3    9    6
6    4   11    7
8    5    9    9
This is the closest I got:
> foo$var3[ foo$var3 < 5 ] = NA
> foo$var2[ is.na(foo$var3) ] = NA
> foo
  var1 var2 var3
1    1    9    6
2    1    5    8
3    2   NA   NA
4    3    9    6
5    3   NA   NA
6    4   11    7
7    4   NA   NA
8    5    9    9
So I guess I just need to know how to conditionally remove the row.
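For reference, here is a minimal base R sketch of one way the conditional removal might be done. It is not necessarily the best approach. The names bad, keep and result are made up for illustration, and it assumes that when every row of a var1 group fails the condition, keeping only the first row of that group is acceptable.

```r
# Rebuild the example data so the sketch is self-contained.
foo <- data.frame(var1 = c(1, 1, 2, 3, 3, 4, 4, 5),
                  var2 = c(9, 5, 13, 9, 12, 11, 13, 9),
                  var3 = c(6, 8, 3, 6, 4, 7, 2, 9))

# Rows that fail the condition var3 >= 5.
bad <- foo$var3 < 5

# Blank out var2 and var3 on the failing rows (same idea as my attempt).
foo$var2[bad] <- NA
foo$var3[bad] <- NA

# A var1 value is "covered" if at least one of its rows passes the condition.
covered <- foo$var1 %in% foo$var1[!bad]

# Keep every passing row, plus the first row of each uncovered var1 value
# (assumption: one NA placeholder row per uncovered value is enough).
keep <- !bad | (!covered & !duplicated(foo$var1))

result <- foo[keep, ]
result
```

With the example data this keeps rows 1, 2, 3, 4, 6 and 8, with NA in var2 and var3 for row 3, which matches the result I expect.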