Behavior of <- NULL on lists versus data.frames for removing data
Posted by Ananda Mahto on Stack Overflow
Published on 2013-10-17T18:49:34Z
Tags: r, data.frame
Many R users eventually figure out lots of ways to remove elements from their data. One way is to use NULL, particularly when you want to do something like drop a column from a data.frame or drop an element from a list.
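As a minimal sketch of that single-element case (the objects `df` and `lst` are made up for illustration and are not part of the original example):

```r
# Toy objects, invented for this illustration
df  <- data.frame(a = 1:3, b = letters[1:3], c = c(TRUE, FALSE, TRUE))
lst <- list(a = 1:3, b = "x", c = TRUE)

# Dropping a single column from a data.frame
df$b <- NULL
names(df)
# [1] "a" "c"

# Dropping a single element from a list
lst[["b"]] <- NULL
names(lst)
# [1] "a" "c"
```

For a single element, the two structures behave the same way: the named item is removed.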
Eventually, a user comes across a situation where they want to drop several columns from a data.frame at once, and they hit upon <- list(NULL) as the solution (since using <- NULL will result in an error).
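A small sketch of the multi-column case, using an invented data.frame `df` and a hypothetical vector `to_drop`. The `setdiff()` variant is not from the original post; it is included only as a point of comparison that avoids NULL assignment altogether:

```r
# Toy data, invented for this illustration
df <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
to_drop <- c("a", "b")

# Approach described in the text: assign list(NULL) to the columns to drop
dropped <- df
dropped[to_drop] <- list(NULL)

# Alternative (an assumption, not the author's method): keep the complement
kept <- df[setdiff(names(df), to_drop)]

names(dropped)
# [1] "c" "d"
identical(names(dropped), names(kept))
# [1] TRUE
```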
A data.frame is a special type of list, so it is natural to expect that the approaches for removing items from a list would work the same way for removing columns from a data.frame. However, they produce different results, as can be seen in the example below.
## Make some small data--two data.frames and two lists
cars1 <- cars2 <- head(mtcars)[1:4]
cars3 <- cars4 <- as.list(cars2)
## Demonstration that the `list(NULL)` approach works
cars1[c("mpg", "cyl")] <- list(NULL)
cars1
#                   disp  hp
# Mazda RX4          160 110
# Mazda RX4 Wag      160 110
# Datsun 710         108  93
# Hornet 4 Drive     258 110
# Hornet Sportabout  360 175
# Valiant            225 105
## Demonstration that simply using `NULL` does not work
cars2[c("mpg", "cyl")] <- NULL
# Error in `[<-.data.frame`(`*tmp*`, c("mpg", "cyl"), value = NULL) :
# replacement has 0 items, need 12
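The error message names `[<-.data.frame`, which suggests that the assignment is handled by a data.frame-specific method rather than by the code used for plain lists. A rough way to check this, using an invented data.frame `df` and only base and utils functions:

```r
# Toy data, invented for this illustration
df <- data.frame(x = 1:3, y = 4:6)

is.list(df)   # a data.frame is stored as a list ...
# [1] TRUE
typeof(df)
# [1] "list"
class(df)     # ... but it carries its own class
# [1] "data.frame"

# An S3 method for `[<-` exists for that class, so `df[...] <- value`
# is dispatched to it instead of the default list behavior
is.function(utils::getS3method("[<-", "data.frame"))
# [1] TRUE

# Stripping the class leaves a plain list
class(unclass(df))
# [1] "list"
```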
Now apply the same concept to a list, and compare the difference in behavior.
## Does not fully drop the items, but sets them to `NULL`
cars3[c("mpg", "cyl")] <- list(NULL)
cars3
# $mpg
# NULL
#
# $cyl
# NULL
#
# $disp
# [1] 160 160 108 258 360 225
#
# $hp
# [1] 110 110 93 110 175 105
## *Does* drop the `list` items while this would
## have produced an error with a `data.frame`
cars4[c("mpg", "cyl")] <- NULL
cars4
# $disp
# [1] 160 160 108 258 360 225
#
# $hp
# [1] 110 110 93 110 175 105
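The two list outcomes above can be told apart programmatically by checking the length or names afterwards. A minimal sketch with an invented list `lst`; the `Filter()` cleanup step is an addition for illustration, not something the original post uses:

```r
# Toy list, invented for this illustration
lst <- list(mpg = c(21, 22.8), cyl = c(6, 4), disp = c(160, 108))

# list(NULL): the elements remain, each holding NULL
kept_as_null <- lst
kept_as_null[c("mpg", "cyl")] <- list(NULL)
length(kept_as_null)
# [1] 3

# NULL: the elements are removed
removed <- lst
removed[c("mpg", "cyl")] <- NULL
length(removed)
# [1] 1

# Cleaning up NULL placeholders afterwards
cleaned <- Filter(Negate(is.null), kept_as_null)
names(cleaned)
# [1] "disp"
```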
The main questions I have are: if a data.frame is a list, why does it behave so differently in this scenario? Is there a foolproof way of knowing when an element will be dropped, when it will produce an error, and when it will simply be given a NULL value? Or do we depend on trial-and-error for this?
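If trial-and-error is the fallback, it can at least be made systematic. The following is only a sketch: `null_outcome()` is a hypothetical helper written for this illustration (it is not part of base R), and it simply performs the assignment on a copy and reports which of the three outcomes occurred.

```r
# Hypothetical helper: apply `x[idx] <- value` to a copy of `x` and
# report whether the named items were dropped, set to NULL, or errored
null_outcome <- function(x, idx, value) {
  result <- tryCatch({
    x[idx] <- value
    x
  }, error = function(e) e)
  if (inherits(result, "error")) return("error")
  if (!any(idx %in% names(result))) return("dropped")
  if (all(vapply(result[idx], is.null, logical(1)))) return("set to NULL")
  "other"
}

df   <- head(mtcars)[1:4]
lst  <- as.list(df)
cols <- c("mpg", "cyl")

null_outcome(df,  cols, list(NULL))
# [1] "dropped"
null_outcome(lst, cols, list(NULL))
# [1] "set to NULL"
null_outcome(lst, cols, NULL)
# [1] "dropped"

# The post above reports an error for this case; no output is asserted here
null_outcome(df, cols, NULL)
```

Because the helper works on a copy inside the function, the original object is left untouched whichever outcome occurs.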