Remove Empty Rows of Data Frame in R (2 Examples)

 

In this R programming tutorial, you’ll learn how to delete rows where all data cells are empty.

The tutorial distinguishes between empty in the sense of an empty character string (i.e. "") and empty in the sense of a missing value (i.e. NA).
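
As a quick illustration of this difference, the two kinds of empty cells respond to different checks. The vector x below is made up for this sketch and is not part of the tutorial data:

```r
x <- c("a", "", NA)          # Hypothetical vector: a value, an empty string, a missing value

empty_string <- x == ""      # FALSE TRUE NA: comparing NA with "" returns NA
missing_value <- is.na(x)    # FALSE FALSE TRUE: an empty string is not a missing value

empty_string
missing_value
```

This is why the two examples in this tutorial use different functions for the two cases.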

Let’s dive into it:

 

Example 1: Removing Rows with Only Empty Cells

This example illustrates how to delete rows where all cells are empty (i.e. ""). First, we need to create some example data with empty rows:

data1 <- data.frame(x1 = c("1", "", "2", "", "3"),  # Create data with empty cells
                    x2 = c("a", "", "b", "c", "d"))
data1                                               # Print data with empty cells
#   x1 x2
# 1  1  a
# 2      
# 3  2  b
# 4     c
# 5  3  d

The previous output of the RStudio console shows the structure of our data. It contains five rows and two columns. Row 2 is empty in both columns, and row 4 is empty only in x1.

If we want to remove rows where all cells are empty, we can use a combination of the apply and all functions as shown below:

data1[!apply(data1 == "", 1, all), ]                # Remove rows with only empty cells
#   x1 x2
# 1  1  a
# 3  2  b
# 4     c
# 5  3  d

Compare this output with the original data. Row 2 was removed because it contained empty cells in all variables. Row 4 was kept because only x1 is empty in that row.
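
To see why this works, the one-liner can be split into its steps. The following is a minimal sketch; the object names is_empty, all_empty, and data1_clean are chosen only for illustration:

```r
data1 <- data.frame(x1 = c("1", "", "2", "", "3"),
                    x2 = c("a", "", "b", "c", "d"))

is_empty <- data1 == ""                 # Logical matrix: TRUE where a cell equals ""
all_empty <- apply(is_empty, 1, all)    # Per row: TRUE only if every cell is ""
data1_clean <- data1[!all_empty, ]      # Keep rows that are not entirely empty
data1_clean
```

Note that this check assumes the data contains no NA values. Comparing NA with "" returns NA, so a row consisting only of NA and "" cells would yield NA instead of TRUE.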

 

Example 2: Removing Rows with Only NA Values

In this example, I’ll illustrate how to delete rows where all cells are NA. Again, we have to create some example data first:

data2 <- data.frame(x1 = c(1, NA, 2, NA, 3),        # Create data with NAs
                    x2 = c("a", NA, "b", "c", "d"))
data2                                               # Print data with NAs
#   x1   x2
# 1  1    a
# 2 NA <NA>
# 3  2    b
# 4 NA    c
# 5  3    d

Now, we can use the rowSums, is.na, and ncol functions to exclude only-NA rows from our data:

data2[rowSums(is.na(data2)) != ncol(data2), ]       # Remove rows with only NAs
#   x1 x2
# 1  1  a
# 3  2  b
# 4 NA  c
# 5  3  d

As you can see, the second row was deleted. Row 4 remains in the data because only x1 is NA in that row.
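
If a data set mixes both kinds of empty cells, the two checks can be combined. The following is a minimal sketch under that assumption; data3 and the helper object blank are hypothetical and not part of the examples above:

```r
data3 <- data.frame(x1 = c("1", NA, "2", "", "3"),   # Hypothetical data mixing "" and NA
                    x2 = c("a", "", "b", NA, "d"))

blank <- is.na(data3) | data3 == ""                  # TRUE where a cell is NA or ""
data3_clean <- data3[rowSums(blank) != ncol(data3), ]  # Drop rows where every cell is blank
data3_clean
```

Since TRUE | NA evaluates to TRUE in R, NA cells are counted as blank and no NA values end up in the row sums. In this sketch, rows 2 and 4 are dropped because each of them contains only NA or "" cells.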

 

Video & Further Resources

If you need further information on the contents of this article, I recommend having a look at the following video, which I have published on my YouTube channel. In the video, I show the R programming syntax of this tutorial.

 

 

In addition, I recommend reading the related RStudio articles on this website.

 

To summarize: In this tutorial, you learned how to remove rows with only empty cells in the R programming language. Let me know in the comments section below if you have any further questions.

 



2 Comments

  • I tried the command data2[rowSums(is.na(data2)) != ncol(data2), ] but got the following output in my console:
    data2[rowSums(is.na(data2)) != ncol(data2), ]
      x1 x2
    1  1  a
    2 NA NA
    3  2  b
    4 NA  c
    5  3  d

    • Hey Atul,

      Thanks for the comment!

      Could you check if the NA values in x2 are real NA values and not “NA” values (i.e. a character string consisting of the letters N and A)?

      Regards,

      Joachim
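
The check suggested in the reply can be sketched as follows. The data frame data_str is hypothetical and only mimics the situation described in the comment, where x2 holds the character string "NA" instead of a real missing value:

```r
data_str <- data.frame(x1 = c(1, NA, 2),             # Hypothetical data for illustration
                       x2 = c("a", "NA", "b"),       # "NA" is a string here, not a missing value
                       stringsAsFactors = FALSE)

real_na <- is.na(data_str$x2)                        # All FALSE: x2 contains no real NA
string_na <- data_str$x2 == "NA"                     # TRUE for the "NA" string
real_na
string_na

data_str$x2[data_str$x2 %in% "NA"] <- NA             # Convert the string to a real NA
data_fixed <- data_str[rowSums(is.na(data_str)) != ncol(data_str), ]
data_fixed
```

When a data frame is printed, R shows real missing values in character columns as <NA>, whereas the string "NA" appears without angle brackets. This can serve as a first hint before running the check.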

