Subset Data Frame Rows by Logical Condition in R (5 Examples)

 

In this tutorial you’ll learn how to subset rows of a data frame based on a logical condition in the R programming language.

Here’s the step-by-step process.

 

Creation of Example Data

In the examples of this R tutorial, I’ll use the following data frame:

data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Create example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))
data                                                         # Print example data
#   x1 x2 group
# 1  3  a    g1
# 2  7  b    g2
# 3  1  c    g1
# 4  8  d    g3
# 5  5  e    g1

Our example data contains five rows and three columns. The column “group” will be used to filter our data.

 

Example 1: Subset Rows with ==

In Example 1, we’ll filter the rows of our data with the == operator. Have a look at the following R code:

data[data$group == "g1", ]                                   # Subset rows with ==
#   x1 x2 group
# 1  3  a    g1
# 3  1  c    g1
# 5  5  e    g1

We selected only the rows where the group column is equal to "g1". We did this by specifying the logical condition data$group == "g1" before the comma within square brackets. The empty position after the comma means that all columns are kept.
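To see why this works, it may help to print the logical vector on its own. The following sketch recreates the example data and stores the condition in a helper object (the name is_g1 is my own choice for illustration):

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Recreate example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

is_g1 <- data$group == "g1"                                  # One TRUE/FALSE value per row
is_g1
# [1]  TRUE FALSE  TRUE FALSE  TRUE

sum(is_g1)                                                   # Count rows that meet the condition
# [1] 3

data_g1 <- data[is_g1, ]                                     # Same as data[data$group == "g1", ]
```

Rows with TRUE are kept and rows with FALSE are dropped, so the condition can be checked before it is used for subsetting.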

 

Example 2: Subset Rows with !=

We can also subset our data the other way around (compared to Example 1). The following R code selects only the rows where the group column is not equal to "g1". We can do this with the != operator:

data[data$group != "g1", ]                                   # Subset rows with !=
#   x1 x2 group
# 2  7  b    g2
# 4  8  d    g3
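As a side note, the same rows can be selected by negating the condition of Example 1 with the ! operator. This is a small sketch on the example data; the object names data_neq and data_not are only for illustration:

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Recreate example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

data_neq <- data[data$group != "g1", ]                       # Condition with !=
data_not <- data[!(data$group == "g1"), ]                    # Negated == condition

identical(data_neq, data_not)                                # Both approaches agree
# [1] TRUE
```

The parentheses around the negated condition are not strictly required here, but they make the intent easier to read.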

 

Example 3: Subset Rows with %in%

We can also use the %in% operator to filter data by a logical vector. The %in% operator is especially helpful when we want to match a column against several values at once. In the following R syntax, we retain the rows where the group column is equal to "g1" OR "g3":

data[data$group %in% c("g1", "g3"), ]                        # Subset rows with %in%
#   x1 x2 group
# 1  3  a    g1
# 3  1  c    g1
# 4  8  d    g3
# 5  5  e    g1
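For comparison, the %in% condition can be written out with two == comparisons joined by the | (OR) operator, and it can be negated with ! to keep every other group. The sketch below assumes the example data from above; keep_in and keep_or are illustrative names:

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Recreate example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

keep_in <- data$group %in% c("g1", "g3")                     # Condition with %in%
keep_or <- data$group == "g1" | data$group == "g3"           # Same condition with |

identical(keep_in, keep_or)                                  # Compare both logical vectors
# [1] TRUE

data_rest <- data[!(data$group %in% c("g1", "g3")), ]        # Keep all other groups
data_rest
#   x1 x2 group
# 2  7  b    g2
```

With more than two or three values, the %in% version usually stays much shorter than a chain of | conditions.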

 

Example 4: Subset Rows with subset Function

Base R also provides the subset() function for the filtering of rows by a logical vector. Consider the following R code:

subset(data, group == "g1")                                  # Apply subset function
#   x1 x2 group
# 1  3  a    g1
# 3  1  c    g1
# 5  5  e    g1

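The output is the same as in Example 1, but this time we used the subset function by specifying the name of our data frame and the logical condition within the function. Note that inside subset() we can refer to the group column directly, i.e. without the data$ prefix.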
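One practical difference between square brackets and subset() shows up when the filter column contains missing values. The following sketch uses a modified copy of the example data (data_na is a hypothetical object that is not part of the tutorial data) in which one group value is set to NA:

```r
data_na <- data.frame(x1 = c(3, 7, 1, 8, 5),                 # Example data with one NA
                      x2 = letters[1:5],
                      group = c("g1", NA, "g1", "g3", "g1"))

res_bracket <- data_na[data_na$group == "g1", ]              # NA condition returns an NA row
nrow(res_bracket)
# [1] 4

res_subset <- subset(data_na, group == "g1")                 # subset() drops NA conditions
nrow(res_subset)
# [1] 3

res_which <- data_na[which(data_na$group == "g1"), ]         # which() keeps only TRUE
nrow(res_which)
# [1] 3
```

If the bracket syntax is preferred, wrapping the condition in which() is one way to avoid the additional row of NA values.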

 

Example 5: Subset Rows with filter Function [dplyr Package]

We can also use the dplyr package to extract rows of our data. First, we need to install and load the package:

install.packages("dplyr")                                    # Install dplyr package
library("dplyr")                                             # Load dplyr package
 
Now, we can use the filter function of the dplyr package as follows:
 
filter(data, group == "g1")                                  # Apply filter function
#   x1 x2 group
# 1  3  a    g1
# 2  1  c    g1
# 3  5  e    g1

Compare the R syntax of Examples 4 and 5. The subset and filter functions are called in the same way: the data frame comes first, followed by the logical condition.
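The two functions differ slightly once several conditions are combined: subset() expects a single condition, so the parts are joined with &, whereas filter() also accepts several conditions separated by commas and treats them as AND. The sketch below adds a second condition on x1 (the threshold of 2 is an arbitrary value chosen for illustration) and only runs the dplyr part if the package is installed:

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Recreate example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

res_subset <- subset(data, group == "g1" & x1 > 2)           # Conditions joined with &
res_subset$x1
# [1] 3 5

if (requireNamespace("dplyr", quietly = TRUE)) {             # Skip if dplyr is missing
  res_filter <- dplyr::filter(data, group == "g1", x1 > 2)   # Comma acts as AND
  print(identical(res_filter$x1, res_subset$x1))             # Same rows are selected
}
```

The dplyr::filter() notation is used here so that the code does not depend on library("dplyr") being called first, which also avoids confusion with the filter function of the stats package.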

 

Video & Further Resources

Would you like to learn more about the subsetting of rows? Then you may have a look at the following video on my YouTube channel. In the video, I illustrate the R programming code of this post in a live session:

 

The YouTube video will be added soon.

 

Furthermore, you might have a look at the related articles on this website.

 

To summarize: This article explained how to return rows according to a matching condition in the R programming language. Please let me know in the comments if you have further questions. Furthermore, please subscribe to my email newsletter to receive regular updates on the newest tutorials.

 


