Subset Data Frame Rows by Logical Condition in R (5 Examples)

In this tutorial you’ll learn how to subset rows of a data frame based on a logical condition in the R programming language.

Here’s the step-by-step process.

Creation of Example Data

In the examples of this R tutorial, I’ll use the following data frame:

data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Create example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))
data                                                         # Print example data
# x1 x2 group
#  3  a    g1
#  7  b    g2
#  1  c    g1
#  8  d    g3
#  5  e    g1

Our example data contains five rows and three columns. The column “group” will be used to filter our data.

Example 1: Subset Rows with ==

In Example 1, we’ll filter the rows of our data with the == operator. Have a look at the following R code:

data[data$group == "g1", ]                                   # Subset rows with ==
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  5  e    g1

We selected only the rows where the group column is equal to “g1”. We did this by specifying data$group == "g1" before the comma within square brackets.
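To see why this works, it can help to look at the condition on its own. The following is a minimal sketch that recreates the example data; the object names keep and res_lgl are my own and are used for illustration only:

```r
# Recreate the example data so that this sketch is self-contained
data <- data.frame(x1 = c(3, 7, 1, 8, 5),
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

keep <- data$group == "g1"                                   # Logical vector, one element per row
keep
# [1]  TRUE FALSE  TRUE FALSE  TRUE

res_lgl <- data[keep, ]                                      # Rows where keep is TRUE, all columns
nrow(res_lgl)
# [1] 3
```

The comparison returns one TRUE or FALSE per row, and the square brackets keep exactly the rows marked TRUE. Leaving the position after the comma empty keeps all columns.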

Example 2: Subset Rows with !=

We can also subset our data the other way around (compared to Example 1). The following R code selects only rows where the group column is not equal to “g1”, using the != operator:

data[data$group != "g1", ]                                   # Subset rows with !=
# x1 x2 group
#  7  b    g2
#  8  d    g3

Example 3: Subset Rows with %in%

We can also use the %in% operator to filter data by a logical condition. The %in% operator is especially helpful when we want to match against several values at once. In the following R syntax, we retain rows where the group column is equal to “g1” OR “g3”:

data[data$group %in% c("g1", "g3"), ]                        # Subset rows with %in%
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  8  d    g3
#  5  e    g1
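As a small sketch of what %in% does here, we can compare it with two == conditions joined by the | (OR) operator, and also negate it with ! to keep everything else. The object names res_in, res_or, and res_not are chosen for illustration only:

```r
# Recreate the example data so that this sketch is self-contained
data <- data.frame(x1 = c(3, 7, 1, 8, 5),
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

res_in <- data[data$group %in% c("g1", "g3"), ]              # Subset rows with %in%
res_or <- data[data$group == "g1" | data$group == "g3", ]    # Same condition written with |
identical(res_in, res_or)
# [1] TRUE

res_not <- data[!(data$group %in% c("g1", "g3")), ]          # Negate: all rows except g1 and g3
res_not
#   x1 x2 group
# 2  7  b    g2
```

With only two values the | version is still readable, but %in% scales better as the list of values grows.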

Example 4: Subset Rows with subset Function

Base R also provides the subset() function for filtering rows by a logical condition. Consider the following R code:

subset(data, group == "g1")                                  # Apply subset function
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  5  e    g1

The output is the same as in Example 1, but this time we used the subset function by specifying the name of our data frame and the logical criteria within the function.
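The subset function can also combine several conditions with & and restrict the returned columns via its select argument. The following is a minimal sketch; the threshold x1 > 2 and the object name res_sub are assumptions made for illustration:

```r
# Recreate the example data so that this sketch is self-contained
data <- data.frame(x1 = c(3, 7, 1, 8, 5),
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

res_sub <- subset(data,
                  group == "g1" & x1 > 2,                    # Both conditions must hold
                  select = c(x1, group))                     # Keep only these columns
res_sub
#   x1 group
# 1  3    g1
# 5  5    g1
```

Inside subset(), column names can be written without the data$ prefix, which keeps longer conditions compact.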

Example 5: Subset Rows with filter Function [dplyr Package]

We can also use the dplyr package to extract rows of our data. First, we need to install and load the package:

install.packages("dplyr")                                    # Install dplyr package
library("dplyr")                                             # Load dplyr package

Now, we can use the filter function of the dplyr package as follows:

filter(data, group == "g1")                                  # Apply filter function
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  5  e    g1

Compare the R syntax of Examples 4 and 5. The subset and filter functions are very similar.
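One difference between the approaches only shows up when the filter column contains missing values. Square brackets return an all-NA row for every NA in the condition, whereas subset() drops such rows (dplyr's filter() drops them as well). The sketch below uses base R only; the data frame data_na is hypothetical and not part of the example data:

```r
data_na <- data.frame(x1 = c(3, 7, 1),                       # Hypothetical data with a missing group
                      group = c("g1", NA, "g1"))

n_bracket <- nrow(data_na[data_na$group == "g1", ])          # Square brackets keep an NA row
n_bracket
# [1] 3

n_subset <- nrow(subset(data_na, group == "g1"))             # subset() drops rows with NA condition
n_subset
# [1] 2

n_which <- nrow(data_na[which(data_na$group == "g1"), ])     # which() removes the NA positions
n_which
# [1] 2
```

If the data may contain NA values, wrapping the condition in which() is one way to make the square-bracket approach behave like subset().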

Video & Further Resources

Would you like to learn more about the subsetting of rows? Then you may have a look at the following video on my YouTube channel. In the video, I illustrate the R programming code of this post in a live session:

Furthermore, you might have a look at the related articles on this website.

To summarize: This article explained how to return rows according to a matching criterion (e.g. conditioning on an ID or a factor variable) in the R programming language. Please let me know in the comments if you have further questions. Furthermore, please subscribe to my email newsletter to receive regular updates on the newest tutorials.

6 Comments

Jeff Norriss
July 6, 2022 4:41 am

What if I wanted all columns but only rows where X1 = 1, 3 or 7?

- Joachim
  July 8, 2022 7:03 pm
  
  Hey Jeff,
  
  Have you tried the code of Example 3 of this tutorial? You could replace “g1” and “g3” with your values and the group column with x1.
  
  Regards,
  Joachim
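
Applied to the question above, the suggestion might look like the following sketch. It assumes the example data of this tutorial; the object name res_x1 is used for illustration only:

```r
# Recreate the example data so that this sketch is self-contained
data <- data.frame(x1 = c(3, 7, 1, 8, 5),
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

res_x1 <- data[data$x1 %in% c(1, 3, 7), ]                    # All columns, rows where x1 is 1, 3 or 7
res_x1
#   x1 x2 group
# 1  3  a    g1
# 2  7  b    g2
# 3  1  c    g1
```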
  
Britta
January 24, 2023 3:38 am

Hi Joachim! Thanks for the tutorial. I’m trying to filter rows that contain “(1)” in a column, but all the lines of code you explained return rows that contain either “1” or “()”. How can I specify that I need it in the exact order of “(1)”?
Thanks!

- Cansu (Statistics Globe)
  January 24, 2023 11:27 am
  Hello Britta,
  
  That is strange; it shouldn’t be the case. Can you share your code with us? I changed the sample data a bit to adapt it to your case and used the first method given in the tutorial. It worked for me as it was supposed to, see below.

  data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Create example data
                     x2 = letters[1:5],
                     group = c("(1)", "1", "(1)", "()", "(1)"))
  data
  #   x1 x2 group
  # 1  3  a   (1)
  # 2  7  b     1
  # 3  1  c   (1)
  # 4  8  d    ()
  # 5  5  e   (1)

  data2 <- data[data$group == "(1)", ]
  data2
  #   x1 x2 group
  # 1  3  a   (1)
  # 3  1  c   (1)
  # 5  5  e   (1)

  Regards,
  Cansu
  - Britta
    January 25, 2023 12:52 am
    
    Hi Cansu,
    
    I tried different packages, this was my result. In my case, I only want the rows that contain (1), so 6(1) and 5(1).
    
    data <- data.frame(x1 = c(3, 7, 1, 8, 5), # Create example data
    x2 = letters[1:5],
    group = c("6(1)", "(2)", "1(2)", "1(3)", "5(1)"))
    
    d2 <- data %>% filter(grepl('(1)', group))
    d2 <- data[data$group == "(1)", ]
    
    Both codes give the same result. I did manage to solve my problem though with the following code:
    
    d2 <- data[grepl("\\(1\\)", data$group), ]
    
    - Cansu (Statistics Globe)
      January 25, 2023 8:43 am
      
      Hello Britta,
      
      Ah yes, it is a different story when the data is like yours. That’s why it is important to know the exact structure of the dataset and the code. I am glad that you solved it. In case of any further questions, we are here to help. Don’t hesitate to contact us.
      
      Regards,
      Cansu
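
The behaviour discussed in this thread comes down to how grepl() reads parentheses: by default the pattern is a regular expression, so "(1)" is a group that matches any "1". The following sketch uses the data posted above and shows two ways to match the literal text "(1)". The object names m_regex, m_escaped, and m_fixed are chosen for illustration only:

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Data as posted in the comment
                   x2 = letters[1:5],
                   group = c("6(1)", "(2)", "1(2)", "1(3)", "5(1)"))

m_regex <- grepl("(1)", data$group)                          # Regex group: matches any "1"
m_regex
# [1]  TRUE FALSE  TRUE  TRUE  TRUE

m_escaped <- grepl("\\(1\\)", data$group)                    # Escaped parentheses: literal "(1)"
m_escaped
# [1]  TRUE FALSE FALSE FALSE  TRUE

m_fixed <- grepl("(1)", data$group, fixed = TRUE)            # fixed = TRUE: pattern taken literally
m_fixed
# [1]  TRUE FALSE FALSE FALSE  TRUE

d2 <- data[m_fixed, ]                                        # Rows that contain "(1)"
```

The == operator from Example 1 only matches cells that are exactly "(1)", which is why it finds no rows in this data.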
      