Conditionally Remove Row from Data Frame in R (3 Examples) | How to Delete Rows

 

This page explains how to conditionally delete rows from a data frame in R programming.

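The article covers the creation of example data and three examples: removing rows based on a single condition, removing rows based on multiple conditions, and removing rows with the subset function.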

Let’s do this.

 

Creation of Example Data

In the examples of this R programming tutorial, we’ll use the following data frame as a basis:

data <- data.frame(x1 = 1:5,                  # Create example data
                   x2 = letters[1:5],
                   x3 = "x")
data                                          # Print example data
#   x1 x2 x3
# 1  1  a  x
# 2  2  b  x
# 3  3  c  x
# 4  4  d  x
# 5  5  e  x

Our example data contains five rows and three columns.

 

Example 1: Remove Row Based on Single Condition

If we want to delete one or multiple rows conditionally, we can use the following R code:

data[data$x1 != 2, ]                          # Remove row based on condition
#   x1 x2 x3
# 1  1  a  x
# 3  3  c  x
# 4  4  d  x
# 5  5  e  x

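The previous R syntax kept only the rows that fulfill the condition data$x1 != 2 and thereby removed the row where x1 is equal to 2 (i.e. the second row). Note that the original data frame is not modified unless the result is assigned to a data object.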
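To make the mechanics more explicit, the following sketch stores the logical condition and the reduced data frame in separate objects. The names keep and data_new are chosen only for illustration and do not appear elsewhere in this tutorial:

```r
data <- data.frame(x1 = 1:5,                  # Same example data as above
                   x2 = letters[1:5],
                   x3 = "x")

keep <- data$x1 != 2                          # TRUE for each row that is kept
keep
# [1]  TRUE FALSE  TRUE  TRUE  TRUE

data_new <- data[keep, ]                      # Store reduced data frame
nrow(data_new)                                # Number of remaining rows
# [1] 4
```

The logical vector contains one value per row, and the square brackets return only the rows with the value TRUE.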

In this example, we used only one logical condition. However, we can also remove rows according to multiple conditions and that’s what I’m going to show you next!

 

Example 2: Remove Row Based on Multiple Conditions

We can remove rows based on multiple conditions by using the &- or the |-operator. Have a look at the following R code:

data[data$x1 != 2 & data$x2 != "e", ]         # Multiple conditions
#   x1 x2 x3
# 1  1  a  x
# 3  3  c  x
# 4  4  d  x

As you can see based on the output of the RStudio console, the previous R syntax deleted two rows (i.e. the second and the fifth row), because only rows that fulfill both logical conditions data$x1 != 2 & data$x2 != "e" are kept.
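The |-operator works in the same way, but keeps each row that fulfills at least one of the conditions. The following sketch illustrates this; the object names res_or and res_not are assumptions made only for this illustration:

```r
data <- data.frame(x1 = 1:5,                  # Same example data as above
                   x2 = letters[1:5],
                   x3 = "x")

res_or <- data[data$x1 < 2 | data$x2 == "e", ]       # Keep rows meeting at least one condition
res_or
#   x1 x2 x3
# 1  1  a  x
# 5  5  e  x

res_not <- data[!(data$x1 == 2 | data$x2 == "e"), ]  # Negate an "or" condition
res_not
#   x1 x2 x3
# 1  1  a  x
# 3  3  c  x
# 4  4  d  x
```

The second call returns the same rows as data$x1 != 2 & data$x2 != "e", since negating an "or" condition is equivalent to combining the negated conditions with "and".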

 

Example 3: Remove Row with subset function

As an alternative to Examples 1 and 2, we can use the subset function. Within subset, the column names can be used directly, i.e. without the data$ prefix:

subset(data, x1 != 2 & x2 != "e")             # Apply subset function
#   x1 x2 x3
# 1  1  a  x
# 3  3  c  x
# 4  4  d  x

The resulting output is the same as in Example 2, since we used the same condition. However, this time we used the subset command instead of square brackets. Which of these options you prefer is a matter of taste!
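One difference between the two options shows up when the condition contains missing values. The following sketch uses a small hypothetical data frame called data_na (not part of the example data above) to compare square brackets, the subset function, and square brackets combined with the which function:

```r
data_na <- data.frame(x1 = c(1, NA, 3),       # Hypothetical data with missing value
                      x2 = c("a", "b", "c"))

res_bracket <- data_na[data_na$x1 != 1, ]     # NA in condition returns a row of NAs
nrow(res_bracket)
# [1] 2

res_subset <- subset(data_na, x1 != 1)        # subset treats NA as FALSE
nrow(res_subset)
# [1] 1

res_which <- data_na[which(data_na$x1 != 1), ]  # which ignores NA values
nrow(res_which)
# [1] 1
```

With square brackets, the row with the missing value is returned as a row that contains only NA. The subset and which functions drop this row instead.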

 

To summarize: In this tutorial you learned how to exclude specific rows from a data frame in the R programming language. Please let me know in the comments in case you have further questions.

 


4 Comments

  • best website ever for learning R

  • Hi Mathias,

    This is very helpful, thank you. I have a slightly different data layout and was wondering if you have any input. I have 5 rows (representing 5 conditions) for each participant and I want to remove all of them based on performance in one of these conditions. So I would like to say something like “if accuracy is smaller than 80% in condition x, remove all rows for this participant”. Is there a way of doing this? Thank you in advance.

    • Hello Danai,

      You can use the following:

      data <- data.frame(
        A = c(5, 2, 3, 2, 4),
        B = c(4, 5, 6, 5, 6),
        C = c(7, 8, 9, 10, 12)
      )
      data
       
      row.names(data) <- c("cond1", "cond2", "cond3", "cond4", "cond5")
      data
      #       A B  C
      # cond1 5 4  7
      # cond2 2 5  8
      # cond3 3 6  9
      # cond4 2 5 10
      # cond5 4 6 12
       
      data_filt1 <- data[, data[1, ] != 5]
      data_filt1
      #       B  C
      # cond1 4  7
      # cond2 5  8
      # cond3 6  9
      # cond4 5 10
      # cond5 6 12

      However, in my opinion, the conventional layout is more intuitive and more useful for model building and for the functions of the dplyr package. You can simply transpose your data and apply the method shown in this tutorial. See the example below.

      data_t <- as.data.frame(t(data))
      data_t
      #   cond1 cond2 cond3 cond4 cond5
      # A     5     2     3     2     4
      # B     4     5     6     5     6
      # C     7     8     9    10    12
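
      The following sketch applies the method of this tutorial to the transposed data. It also shows one possible approach for a layout with several rows per participant, as described in the question. The data frame data_long, its column names participant, condition and accuracy, and the threshold of 0.80 are assumptions made only for this illustration (three instead of five conditions are used to keep it short):

      ```r
      data <- data.frame(A = c(5, 2, 3, 2, 4),      # Data of this reply
                         B = c(4, 5, 6, 5, 6),
                         C = c(7, 8, 9, 10, 12))
      row.names(data) <- paste0("cond", 1:5)
      data_t <- as.data.frame(t(data))              # Transpose data

      data_t_filt <- data_t[data_t$cond1 != 5, ]    # Remove rows where cond1 is equal to 5
      data_t_filt
      #   cond1 cond2 cond3 cond4 cond5
      # B     4     5     6     5     6
      # C     7     8     9    10    12

      data_long <- data.frame(                      # Hypothetical data in long layout
        participant = rep(c("p1", "p2"), each = 3),
        condition = rep(c("x", "y", "z"), times = 2),
        accuracy = c(0.75, 0.90, 0.85, 0.95, 0.70, 0.88)
      )

      bad <- data_long$participant[data_long$condition == "x" &
                                     data_long$accuracy < 0.80]   # Participants below threshold in x
      data_long_filt <- data_long[!data_long$participant %in% bad, ]
      unique(data_long_filt$participant)            # Remaining participants
      # [1] "p2"
      ```

      In the long layout, all rows of participant p1 are removed, because the accuracy of p1 in condition x is below the threshold. The rows of p2 are kept, even though p2 has a lower accuracy in condition y.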

      Regards,
      Cansu

