Select Data Frame Columns by Logical Condition in R (2 Examples)
This page illustrates how to extract particular data frame columns based on a logical condition in the R programming language.
Let’s do this:
Creating Exemplifying Data
First, we’ll need to create some data that we can use in the following examples:
data <- data.frame(x1 = 1:5,               # Create example data
                   y1 = letters[1:5],
                   x2 = "x",
                   x3 = 9:5,
                   y2 = 7)
data                                       # Print example data
#   x1 y1 x2 x3 y2
# 1  1  a  x  9  7
# 2  2  b  x  8  7
# 3  3  c  x  7  7
# 4  4  d  x  6  7
# 5  5  e  x  5  7
The previous output of the RStudio console shows the structure of our example data: It has five rows and five columns. Some of the variable names start with x and some of the variable names start with y.
Example 1: Extract Data Frame Variables by Logical Condition Using grepl() Function
In Example 1, I’ll explain how to select certain columns based on a logical condition using the grepl function. Have a look at the following R code:
data_new1 <- data[ , grepl("x", colnames(data))]  # Extract by logical
data_new1                                         # Print updated data
#   x1 x2 x3
# 1  1  x  9
# 2  2  x  8
# 3  3  x  7
# 4  4  x  6
# 5  5  x  5
As you can see, the previous R syntax created a new data frame called data_new1 that consists only of columns with an x in their name.
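To see why this works, it can help to look at the logical vector that grepl returns. The following is a minimal sketch that rebuilds the example data so that it runs on its own; the object names x_cols and data_one are only illustrative and not part of the example above. It also shows the drop = FALSE argument, which keeps the result a data frame even if only a single column matches the condition.

```r
data <- data.frame(x1 = 1:5, y1 = letters[1:5], x2 = "x", x3 = 9:5, y2 = 7)

x_cols <- grepl("x", colnames(data))   # One TRUE/FALSE value per column name
x_cols
# [1]  TRUE FALSE  TRUE  TRUE FALSE

# With a single match, data[ , condition] would return a plain vector;
# drop = FALSE keeps the data frame structure
data_one <- data[ , grepl("x1", colnames(data)), drop = FALSE]
class(data_one)
# [1] "data.frame"
```

Without drop = FALSE, the second subset would be returned as an integer vector instead of a one-column data frame.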
Example 2: Extract Data Frame Variables by Logical Condition Using select() & starts_with() Functions of dplyr Package
In Example 2, I’ll illustrate how to subset data frame columns whose names match a specific prefix condition.
For this, we’ll use the dplyr add-on package. First, we have to install and load the dplyr package:
install.packages("dplyr")                  # Install dplyr package
library("dplyr")                           # Load dplyr
Now, we can use the select and starts_with functions to extract only columns starting with an x:
data_new2 <- data %>%                      # Using dplyr functions
  select(starts_with("x"))
data_new2                                  # Print updated data
#   x1 x2 x3
# 1  1  x  9
# 2  2  x  8
# 3  3  x  7
# 4  4  x  6
# 5  5  x  5
As you can see, the retained columns are exactly the same as in Example 1.
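Note that the two examples agree here only because every x in our column names sits at the very beginning of the name. As a hedged illustration, the sketch below adds a hypothetical column called max_val (which is not part of the example data) and compares the unanchored pattern of Example 1 with two base R prefix checks. Keep in mind as well that starts_with ignores case by default, whereas grepl and startsWith are case-sensitive.

```r
data <- data.frame(x1 = 1:5, y1 = letters[1:5], x2 = "x", x3 = 9:5, y2 = 7)
data_ext <- cbind(data, max_val = 0)       # Hypothetical extra column
nms <- colnames(data_ext)

any_x <- nms[grepl("x", nms)]              # "x" anywhere in the name
any_x
# [1] "x1"      "x2"      "x3"      "max_val"

prefix_regex <- nms[grepl("^x", nms)]      # "x" at the start only (anchored regex)
prefix_regex
# [1] "x1" "x2" "x3"

prefix_base <- nms[startsWith(nms, "x")]   # Same prefix check without a regex
prefix_base
# [1] "x1" "x2" "x3"
```

If you want the base R approach of Example 1 to select by prefix only, the anchored pattern "^x" or the startsWith function is the safer choice.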
Video & Further Resources
I have recently released a video on my YouTube channel, which illustrates the R programming code of this tutorial.
In addition, you may want to have a look at some of the related tutorials on my homepage. You can find a selection of tutorials below.
- Subset Data Frame Rows by Logical Condition
- Select Only Numeric Columns from Data Frame in R
- R Programming Tutorials
Summary: In this tutorial, I explained how to keep only variables where a logical condition is TRUE in the R programming language. Let me know in the comments section if you have additional questions. Besides that, don’t forget to subscribe to my email newsletter in order to get updates on new articles.