Subset Data Frame Rows by Logical Condition in R (5 Examples)

 

In this tutorial you’ll learn how to subset rows of a data frame based on a logical condition in the R programming language.

Here’s the step-by-step process.

 

Creation of Example Data

In the examples of this R tutorial, I’ll use the following data frame:

data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Create example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))
data                                                         # Print example data
# x1 x2 group
#  3  a    g1
#  7  b    g2
#  1  c    g1
#  8  d    g3
#  5  e    g1

Our example data contains five rows and three columns. The column “group” will be used to filter our data.

 

Example 1: Subset Rows with ==

In Example 1, we’ll filter the rows of our data with the == operator. Have a look at the following R code:

data[data$group == "g1", ]                                   # Subset rows with ==
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  5  e    g1

We selected only rows where the group column is equal to “g1”. We did this by specifying data$group == "g1" before the comma within square brackets.
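
To see why this works, it may help to look at the logical vector on its own. The following is a minimal sketch that re-creates the example data; the object name keep_rows is introduced here only for illustration:

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),    # Re-create example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

keep_rows <- data$group == "g1"              # One TRUE/FALSE per row
keep_rows
# [1]  TRUE FALSE  TRUE FALSE  TRUE

data[keep_rows, ]                            # Same rows as data[data$group == "g1", ]
```

Rows where the vector is TRUE are kept, and rows where it is FALSE are dropped.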

 

Example 2: Subset Rows with !=

We can also subset our data the other way around (compared to Example 1). The following R code selects only rows where the group column is not equal to “g1”. We can do this using the != operator:

data[data$group != "g1", ]                                   # Subset rows with !=
# x1 x2 group
#  7  b    g2
#  8  d    g3

 

Example 3: Subset Rows with %in%

We can also use the %in% operator to filter data by a logical vector. The %in% operator is especially helpful when we want to match against multiple values. In the following R syntax, we retain rows where the group column is equal to “g1” OR “g3”:

data[data$group %in% c("g1", "g3"), ]                        # Subset rows with %in%
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  8  d    g3
#  5  e    g1
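
As a rough illustration of what %in% saves us from typing, the sketch below compares it with two == conditions joined by the | (OR) operator. The object names in_result and or_result are made up for this example:

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),    # Re-create example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

in_result <- data[data$group %in% c("g1", "g3"), ]             # Using %in%
or_result <- data[data$group == "g1" | data$group == "g3", ]   # Using == and |

identical(in_result, or_result)              # Both approaches select the same rows
# [1] TRUE

data[!(data$group %in% c("g1", "g3")), ]     # Negate with ! to keep all other groups
```

With only two values the difference is small, but the %in% version stays short when the list of values grows.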

 

Example 4: Subset Rows with subset Function

Base R also provides the subset() function for the filtering of rows by a logical vector. Consider the following R code:

subset(data, group == "g1")                                  # Apply subset function
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  5  e    g1

The output is the same as in Example 1, but this time we used the subset function by specifying the name of our data frame and the logical criteria within the function.
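
One difference between the two approaches shows up when the filter column contains missing values. The following sketch assumes a modified copy of the example data (called data_na here, a name that does not appear elsewhere in this tutorial) with one NA in the group column:

```r
data <- data.frame(x1 = c(3, 7, 1, 8, 5),    # Re-create example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

data_na <- data                              # Hypothetical copy with a missing value
data_na$group[2] <- NA

bracket_result <- data_na[data_na$group == "g1", ]   # NA condition yields a row of NAs
subset_result <- subset(data_na, group == "g1")      # subset() drops rows where condition is NA

nrow(bracket_result)
# [1] 4
nrow(subset_result)
# [1] 3
```

If the data may contain NA values, subset() (or wrapping the condition in which()) is therefore often the more convenient choice.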

 

Example 5: Subset Rows with filter Function [dplyr Package]

We can also use the dplyr package to extract rows of our data. First, we need to install and load the package to R:

install.packages("dplyr")                                    # Install dplyr package
library("dplyr")                                             # Load dplyr package

Now, we can use the filter function of the dplyr package as follows:

filter(data, group == "g1")                                  # Apply filter function
# x1 x2 group
#  3  a    g1
#  1  c    g1
#  5  e    g1

Compare the R syntax of Examples 4 and 5. The subset and filter functions are very similar.
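
The similarity also holds for more than one condition. The sketch below assumes that the dplyr package is installed; the additional condition x1 > 2 and the object names filter_result and subset_result are chosen only for illustration. Writing dplyr::filter avoids confusion with the filter function of the stats package, which is loaded by default in R:

```r
library("dplyr")                             # Assumes dplyr is already installed

data <- data.frame(x1 = c(3, 7, 1, 8, 5),    # Re-create example data
                   x2 = letters[1:5],
                   group = c("g1", "g2", "g1", "g3", "g1"))

filter_result <- dplyr::filter(data, group == "g1", x1 > 2)   # Comma-separated conditions are combined with AND
subset_result <- subset(data, group == "g1" & x1 > 2)         # subset() needs an explicit &

filter_result$x1
# [1] 3 5
subset_result$x1
# [1] 3 5
```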

 

Video & Further Resources

Would you like to learn more about the subsetting of rows? Then you may have a look at the following video of my YouTube channel. In the video, I illustrate the R programming code of this post in a live session:

 

 

Furthermore, you might have a look at the related articles on this website.

 

To summarize: This article explained how to return rows according to a matching criterion (e.g. conditioning on an ID or a factor variable) in the R programming language. Please let me know in the comments if you have further questions. Furthermore, please subscribe to my email newsletter to receive regular updates on the newest tutorials.

 



6 Comments

  • Jeff Norriss
    July 6, 2022 4:41 am

    What if I wanted all columns but only rows where X1 = 1, 3 or 7?

    • Hey Jeff,

      Have you tried the code of Example 3 of this tutorial? You might exchange “g1” and “g3” by your values.

      Regards,
      Joachim
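
      Adapted to the question above, the approach of Example 3 might look as follows. This is only a sketch based on the example data of this tutorial, where the column is named x1 in lower case:

      ```r
      data <- data.frame(x1 = c(3, 7, 1, 8, 5),    # Re-create example data
                         x2 = letters[1:5],
                         group = c("g1", "g2", "g1", "g3", "g1"))

      data[data$x1 %in% c(1, 3, 7), ]              # All columns, rows where x1 is 1, 3 or 7
      #   x1 x2 group
      # 1  3  a    g1
      # 2  7  b    g2
      # 3  1  c    g1
      ```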

  • Hi Joachim! Thanks for the tutorial. I’m trying to filter rows that contain “(1)” in a column, but all the lines of code you explained return rows that contain either “1” or “()”. How can I specify that I need it in the exact order of “(1)”?
    Thanks!

    • Hello Britta,

      It is strange, shouldn’t be the case. Can you share your code with us? I changed the sample data a bit to adapt it to your case and used the first method given in the tutorial. It worked for me as it was supposed to, see below.

      data <- data.frame(x1 = c(3, 7, 1, 8, 5),                    # Create example data
                         x2 = letters[1:5],
                         group = c("(1)", "1", "(1)", "()", "(1)"))
      data
      # > data
      #    x1 x2 group
      # 1  3  a   (1)
      # 2  7  b     1
      # 3  1  c   (1)
      # 4  8  d    ()
      # 5  5  e   (1)
       
      data2<-data[data$group == "(1)", ]   
      data2
      #   x1 x2 group
      # 1  3  a   (1)
      # 3  1  c   (1)
      # 5  5  e   (1)

      Regards,
      Cansu

      • Hi Cansu,

        I tried different packages, this was my result. In my case, I only want the rows that contain (1), so 6(1) and 5(1).

        data <- data.frame(x1 = c(3, 7, 1, 8, 5), # Create example data
        x2 = letters[1:5],
        group = c("6(1)", "(2)", "1(2)", "1(3)", "5(1)"))

        d2 <- data %>% filter(grepl('(1)', group))
        d2<-data[data$group == "(1)", ]

        Both codes give the same result. I did manage to solve my problem though with the following code:

        d2 <- data[grepl("\\(1\\)",data$group),]

        • Hello Britta,

          Ah yes, it is a different story when the data is like yours. That’s why it is important to know the exact structure of the dataset and the code. I am glad that you solved it. In case of any further questions, we are here to help. Don’t hesitate to contact us.

          Regards,
          Cansu
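
          For readers with a similar problem, here is a short sketch of why the pattern matters. By default, grepl treats the pattern as a regular expression, in which parentheses form a group instead of matching literal characters. The vector name group_values is made up for this illustration:

          ```r
          group_values <- c("6(1)", "(2)", "1(2)", "1(3)", "5(1)")

          grepl("(1)", group_values)                   # Regex group: matches any "1"
          # [1]  TRUE FALSE  TRUE  TRUE  TRUE

          grepl("\\(1\\)", group_values)               # Escaped parentheses: literal "(1)"
          # [1]  TRUE FALSE FALSE FALSE  TRUE

          grepl("(1)", group_values, fixed = TRUE)     # fixed = TRUE: pattern is taken literally
          # [1]  TRUE FALSE FALSE FALSE  TRUE
          ```

          The == operator, in contrast, only returns TRUE when the entire value equals "(1)", which is why it finds no rows in this data.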
