Posted By: Anonymous
I have a data.frame in R. I want to apply two different conditions to two different columns, and I want these conditions to be inclusive, so I would like to combine them with "OR". I have used the following syntax before with a lot of success when I wanted the "AND" condition.
my.data.frame <- data[(data$V1 > 2) & (data$V2 < 4), ]
But I don’t know how to use an ‘OR’ in the above.
my.data.frame <- subset(data , V1 > 2 | V2 < 4)
An alternative solution that mimics the behavior of this function and would be more appropriate for inclusion within a function body:
new.data <- data[ which( data$V1 > 2 | data$V2 < 4) , ]
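To check that the two forms select the same rows, here is a minimal sketch on a made-up data frame (the values, including the NA in V1, are invented for illustration and are not from the question):

```r
# Toy data, invented for illustration; row 3 has an NA in V1.
data <- data.frame(V1 = c(1, 3, NA, 5, 0),
                   V2 = c(5, 5, 6, 1, 2))

# subset() keeps rows where the condition is TRUE and drops rows where it is NA.
by.subset <- subset(data, V1 > 2 | V2 < 4)

# which() returns only the positions of TRUE values, so the NA row is dropped too.
by.which <- data[which(data$V1 > 2 | data$V2 < 4), ]

print(by.subset)                      # rows 2, 4 and 5 of the toy data
print(all.equal(by.subset, by.which))
```

Row 3 is left out by both forms because NA > 2 is NA and 6 < 4 is FALSE, so the combined condition for that row is NA rather than TRUE.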
Some people criticize the use of which as not needed, but it does prevent NA values from throwing back unwanted results. The equivalent to the two options demonstrated above without which (i.e. not returning NA-rows for any NA's in V1 or V2) would be:
new.data <- data[ !is.na(data$V1 > 2 | data$V2 < 4) & ( data$V1 > 2 | data$V2 < 4) , ]
Note: I want to thank the anonymous contributor who attempted to fix the error in the code immediately above, a fix that was rejected by the moderators. There was actually an additional error that I noticed while correcting the first one. The conditional clause that checks for NA values needs to be first if it is to be handled as I intended, since …
> NA & 1
[1] NA
> 0 & NA
[1] FALSE
Order of arguments may matter when using '&'.
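How '&' and '|' treat NA can be checked directly at the console. The sketch below uses a made-up logical vector cond (a hypothetical name, standing in for a condition such as data$V1 > 2 | data$V2 < 4) and an invented one-column data frame toy. It shows why a bare logical index returns an all-NA row, and how the !is.na() guard and which() each avoid it:

```r
# R's logical operators are three-valued: a FALSE operand settles `&`,
# a TRUE operand settles `|`; otherwise an NA operand makes the result NA.
print(c(NA & TRUE, NA & FALSE, NA | TRUE, NA | FALSE))

# Hypothetical condition vector with one undecidable (NA) element.
cond <- c(TRUE, NA, FALSE)
toy  <- data.frame(id = 1:3)   # invented data for illustration

bare    <- toy[cond, , drop = FALSE]                  # NA index yields an all-NA row
guarded <- toy[!is.na(cond) & cond, , drop = FALSE]   # guard turns NA into FALSE
indexed <- toy[which(cond), , drop = FALSE]           # which() skips NA positions

print(nrow(bare))      # 2: the TRUE row plus one NA row
print(nrow(guarded))   # 1
print(nrow(indexed))   # 1
```

Storing the condition in a variable first also avoids writing the comparison twice inside the brackets.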