# Select Data Frame Columns by Logical Condition in R (2 Examples)

This page illustrates how to extract particular data frame columns based on a logical condition in the R programming language.

Let's do this:

## Creating Exemplifying Data

First, we'll need to create some data that we can use in the following examples:

```r
data <- data.frame(x1 = 1:5,                      # Create example data
                   y1 = letters[1:5],
                   x2 = "x",
                   x3 = 9:5,
                   y2 = 7)
data                                              # Print example data
#   x1 y1 x2 x3 y2
# 1  1  a  x  9  7
# 2  2  b  x  8  7
# 3  3  c  x  7  7
# 4  4  d  x  6  7
# 5  5  e  x  5  7
```

The previous output of the RStudio console shows the structure of our example data: It has five rows and five columns. Some of the variable names start with x and some of the variable names start with y.
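If you want to confirm this structure in code rather than by eye, a minimal sketch such as the following may help. It simply recreates the example data from above and asks R for its dimensions and column names:

```r
# Recreate the example data from this tutorial
data <- data.frame(x1 = 1:5,
                   y1 = letters[1:5],
                   x2 = "x",
                   x3 = 9:5,
                   y2 = 7)

dim(data)                                         # Number of rows and columns
# [1] 5 5

colnames(data)                                    # Names of all variables
# [1] "x1" "y1" "x2" "x3" "y2"
```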

## Example 1: Extract Data Frame Variables by Logical Condition Using grepl() Function

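In Example 1, I'll explain how to select certain columns based on a logical condition using the grepl function. The grepl function returns TRUE for each column name that contains the specified pattern, and this logical vector is then used to subset the columns. Have a look at the following R code: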

```r
data_new1 <- data[ , grepl("x", colnames(data))]  # Extract by logical
data_new1                                         # Print updated data
#   x1 x2 x3
# 1  1  x  9
# 2  2  x  8
# 3  3  x  7
# 4  4  x  6
# 5  5  x  5
```

As you can see, the previous R syntax created a new data frame called data_new1 that consists only of columns with an x in their name.
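Note that the pattern "x" matches an x anywhere in a column name, so a hypothetical column called max would be kept as well. The following sketch shows one possible variation under that assumption: it anchors the pattern with ^ so that only names beginning with x match, and it adds drop = FALSE so that the result stays a data frame even when only a single column is selected. The object names keep, data_prefix, and one_col are made up for this illustration:

```r
# Recreate the example data from this tutorial
data <- data.frame(x1 = 1:5,
                   y1 = letters[1:5],
                   x2 = "x",
                   x3 = 9:5,
                   y2 = 7)

keep <- grepl("^x", colnames(data))               # TRUE if name starts with x
keep                                              # Print logical vector
# [1]  TRUE FALSE  TRUE  TRUE FALSE

data_prefix <- data[ , keep, drop = FALSE]        # Subset columns by logical
colnames(data_prefix)                             # Print retained names
# [1] "x1" "x2" "x3"

one_col <- data[ , grepl("^y1", colnames(data)),  # Only one column matches
                 drop = FALSE]
is.data.frame(one_col)                            # Still a data frame
# [1] TRUE
```

Without drop = FALSE, a selection that matches only one column would be simplified to a plain vector.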

## Example 2: Extract Data Frame Variables by Logical Condition Using select() & starts_with() Functions of dplyr Package

In Example 2, I'll illustrate how to subset data frame columns whose names match a specific prefix condition.

For this, we'll use the dplyr add-on package. First, we have to install and load the dplyr package:

```r
install.packages("dplyr")                         # Install dplyr package
library("dplyr")                                  # Load dplyr package
```

Now, we can use the select and starts_with functions to extract only columns starting with an x:

```r
data_new2 <- data %>%                             # Using dplyr functions
  select(starts_with("x"))
data_new2                                         # Print updated data
#   x1 x2 x3
# 1  1  x  9
# 2  2  x  8
# 3  3  x  7
# 4  4  x  6
# 5  5  x  5
```

As you can see, the retained columns are exactly the same as in Example 1.
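Strictly speaking, starts_with checks only the beginning of each name, whereas grepl("x", ...) in Example 1 matched an x at any position. The two approaches agree here only because every x in our column names is the first character. As a rough sketch, the closer dplyr counterparts would be contains for a literal match anywhere in the name and matches for a regular expression. The object names data_contains and data_matches are made up for this illustration, and the code assumes that the dplyr package is already installed:

```r
library("dplyr")                                  # Load dplyr package

# Recreate the example data from this tutorial
data <- data.frame(x1 = 1:5,
                   y1 = letters[1:5],
                   x2 = "x",
                   x3 = 9:5,
                   y2 = 7)

data_contains <- data %>%                         # x anywhere in the name
  select(contains("x"))

data_matches <- data %>%                          # Regular expression: prefix x
  select(matches("^x"))

identical(colnames(data_contains),                # Same columns for this data
          colnames(data_matches))
# [1] TRUE
```

Keep in mind that these selection helpers ignore case by default, while grepl is case-sensitive unless you set ignore.case = TRUE.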

## Video & Further Resources

I have recently released a video on my YouTube channel, which illustrates the R programming code of this tutorial. You can find the video below:

In addition, you may want to have a look at some of the related tutorials on my homepage. You can find a selection of tutorials below.

Summary: In this tutorial, I explained how to keep only variables where a logical condition is TRUE in the R programming language. Let me know in the comments section if you have additional questions. Besides that, don't forget to subscribe to my email newsletter in order to get updates on new articles.
