Remove NA Values in Only One Column of Data Frame in R (Example)
In this R tutorial you’ll learn how to omit NA values in only one specific data frame variable.
It’s time to dive into the example!
Constructing Example Data
The first step is to create some example data:
data <- data.frame(x1 = 1:7,                                  # Create example data frame
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))
data                                                          # Print example data frame
Table 1 shows the structure of our example data: It is composed of seven rows and three columns. The variables x2 and x3 both contain NA values.
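As a quick check of where the missing values are, the NA values per column can be counted by combining is.na() with colSums(). The following is a minimal sketch that rebuilds the example data so that it runs on its own; the object name na_counts is only used for illustration:

```r
# Recreate the example data frame from this tutorial
data <- data.frame(x1 = 1:7,
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

# is.na() returns a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(data))
na_counts
# x1 x2 x3 
#  0  2  3 
```

The column x1 is complete, while x2 contains two and x3 contains three missing values.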
If we would like to delete all rows where at least one NA value occurs, we might use the na.omit function:
data_all_NA <- na.omit(data)                                  # Remove NA values in all columns
data_all_NA                                                   # Print data frame subset without NA values
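To get a feeling for how many rows na.omit() discards in this example, we may compare the number of rows before and after the removal. The sketch below recreates the example data; complete.cases() is used here only to show which rows are free of NA values:

```r
# Recreate the example data frame from this tutorial
data <- data.frame(x1 = 1:7,
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

data_all_NA <- na.omit(data)      # Remove rows with an NA in any column

nrow(data)                        # Number of rows before removal
# [1] 7
nrow(data_all_NA)                 # Number of rows after removal
# [1] 2

complete.cases(data)              # TRUE for rows without any NA value
# [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
```

Only the last two rows survive, because every other row has a missing value in either x2 or x3.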
However, in this tutorial I’ll explain how to remove only those rows where one specific variable contains a missing value.
Let’s do this!
Example: Omit NA Values in Only One Data Frame Column Using is.na() Function
In this example, I’ll explain how to delete rows in our data where a certain column contains an NA value.
To achieve this, we can use the is.na function as shown below:
data_col_NA <- data[!is.na(data$x2), ]                        # Drop NAs in only one column
data_col_NA                                                   # Print data frame without NA in one column
Table 3 shows the output of the previous R code: We have kept only rows where the column x2 was not missing.
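The same subset can also be created with other base R functions. The following sketch is not part of the original example; it assumes the example data from above and uses the illustrative object names alt_complete and alt_subset to show two possible alternatives, complete.cases() applied to a single column and subset():

```r
# Recreate the example data frame from this tutorial
data <- data.frame(x1 = 1:7,
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

data_col_NA  <- data[!is.na(data$x2), ]            # Approach of this tutorial
alt_complete <- data[complete.cases(data$x2), ]    # Alternative 1: complete.cases on one column
alt_subset   <- subset(data, !is.na(x2))           # Alternative 2: subset function

data_col_NA$x1                                     # Rows 1 and 3 have been removed
# [1] 2 4 5 6 7
alt_complete$x1
# [1] 2 4 5 6 7
alt_subset$x1
# [1] 2 4 5 6 7

data_col_NA$x3                                     # NA values in x3 are still present
# [1] NA  NA  NA  "c" "d"
```

Note that the missing values in x3 are kept in all three versions, since only the column x2 was used as the filter condition.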
Video & Further Resources
I have recently published a video on my YouTube channel, which shows the R code of this tutorial.
It is important to note that the deletion of missing values can cause bias and the loss of many observations in your sample. For that reason, it might be a better choice to use missing data imputation methods in case you are facing missing values in your data. Please have a look at this tutorial for more details.
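To give a rough idea of what imputation looks like, the following sketch replaces the missing values in x2 with the mean of the observed values. This is a deliberately simple illustration, assuming the example data from above; mean imputation is only one of many imputation methods and is often not the most appropriate one. The object names data_imp and x2_mean are made up for this example:

```r
# Recreate the example data frame from this tutorial
data <- data.frame(x1 = 1:7,
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

data_imp <- data                                   # Copy data, keep original unchanged
x2_mean  <- mean(data_imp$x2, na.rm = TRUE)        # Mean of observed x2 values
data_imp$x2[is.na(data_imp$x2)] <- x2_mean         # Replace NA values by mean

data_imp$x2                                        # Print imputed column
# [1] 4.4 3.0 4.4 5.0 5.0 7.0 2.0
```

All seven rows are retained in this case, at the cost of filling in values that were never observed.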
In addition, you might want to read the related tutorials on this website. You can find a selection of articles below.
- Remove All Whitespace in Each Data Frame Column
- Remove Negative Values from Vector & Data Frame
- Replace NA with 0 (10 Examples for Data Frame, Vector & Column)
- Remove NA Values from ggplot2 Plot in R
- R Programming Examples
In this tutorial, I have illustrated how to remove missing values in only one specific data frame column in the R programming language. Don’t hesitate to let me know in the comments section if you have any additional questions.