Remove All-NA Columns from Data Frame in R (Example)

 

In this article you’ll learn how to drop columns where all values are NA in the R programming language.

Here’s how to do it!

 

Example Data

In the example of this R tutorial, we’ll use the following data frame:

data <- data.frame(x1 = 1:5,                            # Create example data
                   x2 = letters[1:5],
                   x3 = NA,
                   x4 = c(NA, 5, 3, NA, 5),
                   x5 = NA)
data                                                    # Print example data
#   x1 x2 x3 x4 x5
# 1  1  a NA NA NA
# 2  2  b NA  5 NA
# 3  3  c NA  3 NA
# 4  4  d NA NA NA
# 5  5  e NA  5 NA

As you can see based on the RStudio console output, our example data consists of five variables x1, x2, x3, x4, and x5. Some of these variables (x3 and x5) contain only missing values (i.e. NA – Not Available). The other columns contain either some NA values (x4) or no NA values at all (x1 and x2).
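To check this description directly, the number of NA values per column can be counted before removing anything. The following lines are a small sketch based on the example data; the object names na_counts and all_na are chosen only for illustration and are not part of the original tutorial:

```r
data <- data.frame(x1 = 1:5,                            # Re-create example data
                   x2 = letters[1:5],
                   x3 = NA,
                   x4 = c(NA, 5, 3, NA, 5),
                   x5 = NA)

na_counts <- colSums(is.na(data))                       # Count NA values per column
na_counts
# x1 x2 x3 x4 x5 
#  0  0  5  2  5 

all_na <- na_counts == nrow(data)                       # TRUE for columns with NA only
names(data)[all_na]                                     # Names of all-NA columns
# [1] "x3" "x5"
```

A column is all-NA exactly when its NA count equals the number of rows, which is the case for x3 and x5 here.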

In the following example, we are going to remove columns where all values are NA…

 

Example: Drop Variables where All Values are Missing

If we want to delete variables with only-NA values, we can use a combination of the colSums, is.na, and nrow functions. Have a look at the following R syntax:

data_new <- data[ , colSums(is.na(data)) < nrow(data)]  # Remove columns with NA only
data_new                                                # Print updated data
#   x1 x2 x4
# 1  1  a NA
# 2  2  b  5
# 3  3  c  3
# 4  4  d NA
# 5  5  e  5

The output of the previous R code is a new data frame with the name data_new. As you can see, this data frame consists of only three columns. The all-NA variables x3 and x5 were removed.
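One detail worth keeping in mind: if only a single column survives the filter, the bracket syntax simplifies the result to a vector by default. The following sketch wraps the same logic in a helper that sets drop = FALSE so that a data frame is always returned. The function name remove_all_na_cols and the small data frame data_one are hypothetical and used only for illustration; the Filter call at the end is an alternative base R approach, not the method of this tutorial:

```r
data <- data.frame(x1 = 1:5,                            # Re-create example data
                   x2 = letters[1:5],
                   x3 = NA,
                   x4 = c(NA, 5, 3, NA, 5),
                   x5 = NA)

remove_all_na_cols <- function(df) {                    # Hypothetical helper function
  keep <- colSums(is.na(df)) < nrow(df)                 # TRUE for columns with any non-NA value
  df[ , keep, drop = FALSE]                             # drop = FALSE keeps a data frame
}

names(remove_all_na_cols(data))                         # Same columns as data_new
# [1] "x1" "x2" "x4"

data_one <- data.frame(x1 = 1:3, x2 = NA)               # Only one column is not all-NA
is.data.frame(data_one[ , colSums(is.na(data_one)) < nrow(data_one)])
# [1] FALSE
is.data.frame(remove_all_na_cols(data_one))
# [1] TRUE

data_filter <- Filter(function(col) !all(is.na(col)), data)  # Alternative using Filter
names(data_filter)
# [1] "x1" "x2" "x4"
```

For the example data of this tutorial, both variants keep the columns x1, x2, and x4. The drop = FALSE argument only makes a difference when fewer than two columns remain.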

 

Video & Further Resources

I have recently published a video on my YouTube channel, which shows the R programming code of this tutorial. You can find the video below:

 

 

In addition, I recommend reading the other posts on this website, which include further articles about the handling of data frames.

 

This article showed how to drop multiple data frame columns without any valid values in the R programming language. If you have further questions, please let me know in the comments section.

 
