Remove All-NA Columns from Data Frame in R (Example)
In this article you’ll learn how to drop columns where all values are NA in the R programming language.
Here’s how to do it!
Example Data
In the example of this R tutorial, we’ll use the following data frame:
data <- data.frame(x1 = 1:5,                    # Create example data
                   x2 = letters[1:5],
                   x3 = NA,
                   x4 = c(NA, 5, 3, NA, 5),
                   x5 = NA)
data                                            # Print example data
#   x1 x2 x3 x4 x5
# 1  1  a NA NA NA
# 2  2  b NA  5 NA
# 3  3  c NA  3 NA
# 4  4  d NA NA NA
# 5  5  e NA  5 NA
As you can see based on the RStudio console output, our example data consists of five variables x1, x2, x3, x4, and x5. Two of these variables (x3 and x5) contain only missing values (i.e. NA, short for Not Available). The column x4 contains some NA values, and the columns x1 and x2 contain no NA values at all.
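Before removing anything, it can help to check how many values are missing in each column. The following lines are a minimal sketch of such a check, using the example data from above; the object names na_counts and all_na are chosen for illustration only:

```r
data <- data.frame(x1 = 1:5,                    # Re-create example data
                   x2 = letters[1:5],
                   x3 = NA,
                   x4 = c(NA, 5, 3, NA, 5),
                   x5 = NA)

na_counts <- colSums(is.na(data))               # Count NA values per column
na_counts
# x1 x2 x3 x4 x5
#  0  0  5  2  5

all_na <- na_counts == nrow(data)               # TRUE where every value is NA
names(data)[all_na]                             # Names of all-NA columns
# [1] "x3" "x5"
```

A column is all-NA exactly when its NA count equals the number of rows, which is the comparison used for the filtering step.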
In the following example, we are going to remove columns where all values are NA…
Example: Drop Variables where All Values are Missing
If we want to delete variables with only-NA values, we can use a combination of the colSums, is.na, and nrow functions. Have a look at the following R syntax:
data_new <- data[ , colSums(is.na(data)) < nrow(data)]   # Remove columns with NA only
data_new                                                 # Print updated data
#   x1 x2 x4
# 1  1  a NA
# 2  2  b  5
# 3  3  c  3
# 4  4  d NA
# 5  5  e  5
The output of the previous R code is a new data frame with the name data_new. As you can see, this data frame consists of only three columns. The all-NA variables x3 and x5 were removed, while x4 was kept because it contains at least one non-NA value.
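One detail worth keeping in mind: if only a single column survives the filter, the bracket syntax returns a plain vector instead of a data frame by default. The sketch below shows one possible way to guard against this with the drop = FALSE argument. The function name remove_all_na_cols and the data frame small are hypothetical and used for illustration only; they are not part of base R:

```r
# Hypothetical helper (not part of base R): drop columns that contain only NA
remove_all_na_cols <- function(df) {
  keep <- colSums(is.na(df)) < nrow(df)         # TRUE for columns with a non-NA value
  df[ , keep, drop = FALSE]                     # drop = FALSE keeps a data frame
}

small <- data.frame(x1 = 1:3,                   # Only x1 survives the filter
                    x2 = NA)

small[ , colSums(is.na(small)) < nrow(small)]   # Default: result is a vector
# [1] 1 2 3

remove_all_na_cols(small)                       # Result is still a data frame
#   x1
# 1  1
# 2  2
# 3  3
```

Wrapping the logic in a small function like this is just a convenience for applying the same step to several data frames.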
Video & Further Resources
I have recently published a video on my YouTube channel, which shows the R programming code of this tutorial. You can find the video below:
In addition, I recommend reading the other posts on this website. You can find a selection of articles about the handling of data frames below.
- The is.na Function in R
- The colSums Function in R
- The nrow Function in R
- Remove Data Frame Columns by Name
- The R Programming Language
This article showed how to drop multiple data frame columns without any valid values in the R programming language. If you have further questions, please let me know in the comments section.