Remove NA Values in Only One Column of Data Frame in R (Example)

 

In this R tutorial you’ll learn how to remove the rows of a data frame that contain NA values in one specific column.

It’s time to dive into the example!

 

Constructing Example Data

The first step is to create some example data:

data <- data.frame(x1 = 1:7,              # Create example data frame
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))
data                                      # Print example data frame

 

Table 1: Example data frame created by the code above.

 

Table 1 shows the structure of our example data: It is composed of seven rows and three columns. The variables x2 and x3 both contain NA values.
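
To check which columns contain missing values before removing anything, the NA count per column can be computed. The following is a minimal sketch that recreates the example data; the object name na_counts is chosen here for illustration only:

```r
data <- data.frame(x1 = 1:7,              # Recreate example data frame
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

na_counts <- colSums(is.na(data))         # Count NA values per column
na_counts                                 # Print NA counts
```

For this data, x1 has no NA values, x2 has two, and x3 has three.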

If we would like to delete all rows where at least one NA value occurs, we might use the na.omit function:

data_all_NA <- na.omit(data)              # Remove NA values in all columns
data_all_NA                               # Print data frame subset without NA values

 

Table 2: Data frame after applying na.omit; only rows without any NA value remain.
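
To see how many rows na.omit discards, it can help to list the affected rows first. The sketch below recreates the example data and uses complete.cases, a base R function that flags rows without any NA; the name dropped_rows is illustrative:

```r
data <- data.frame(x1 = 1:7,              # Recreate example data frame
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

dropped_rows <- which(!complete.cases(data))  # Rows with at least one NA
dropped_rows                              # Print indices of incomplete rows
nrow(na.omit(data))                       # Number of rows left after na.omit
```

In this data, five of the seven rows have an NA in x2 or x3, so only two rows are left.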

 

However, in this tutorial I’ll explain how to remove only those rows where one specific variable contains a missing value.

Let’s do this!

 

Example: Omit NA Values in Only One Data Frame Column Using is.na() Function

In this example, I’ll explain how to delete rows in our data where a certain column contains an NA value.

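To achieve this, we can use the is.na function as shown below. The expression is.na(data$x2) returns TRUE for every row where x2 is missing; negating it with ! and using the result as a row index keeps only the rows where x2 is present: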

data_col_NA <- data[!is.na(data$x2), ]    # Drop NAs in only one column
data_col_NA                               # Print data frame without NA in one column

 

Table 3: Data frame after removing the rows where x2 is NA.

 

Table 3 shows the output of the previous R code: We have kept only the rows where the column x2 is not missing. The NA values in x3 are untouched, so rows with a missing x3 remain in the data.
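
The same subset can likely be obtained in other ways with base R. The following sketch, which is not part of the original example, recreates the data and compares the is.na approach with complete.cases applied to a single column and with subset; the names data_cc and data_sub are illustrative:

```r
data <- data.frame(x1 = 1:7,              # Recreate example data frame
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

data_col_NA <- data[!is.na(data$x2), ]          # Approach from this tutorial
data_cc     <- data[complete.cases(data$x2), ]  # complete.cases on one column
data_sub    <- subset(data, !is.na(x2))         # subset with a logical condition

data_cc                                   # Print result of complete.cases approach
data_sub                                  # Print result of subset approach
```

All three calls keep the same five rows, so the choice is mostly a matter of style. Note that subset relies on non-standard evaluation and is therefore best suited to interactive use.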

 

Video & Further Resources

It is important to note that deleting missing values can introduce bias and remove many observations from your sample. For that reason, missing data imputation methods may be a better choice when your data contain missing values.
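
As a very rough illustration of the idea, the sketch below replaces the missing values in x2 with the column mean instead of deleting rows. Mean imputation is used here only because it is short; it is an assumption of this sketch, not a recommendation, and it can distort variances and correlations. The name data_imp is illustrative:

```r
data <- data.frame(x1 = 1:7,              # Recreate example data frame
                   x2 = c(NA, 3, NA, 5, 5, 7, 2),
                   x3 = c("a", NA, "b", NA, NA, "c", "d"))

data_imp <- data                          # Copy the data frame
x2_mean <- mean(data_imp$x2, na.rm = TRUE)    # Mean of observed x2 values
data_imp$x2[is.na(data_imp$x2)] <- x2_mean    # Fill NAs in x2 with the mean
data_imp                                  # Print data frame with imputed x2
```

All seven rows are retained, and only x2 is modified; the NA values in x3 stay as they are.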

In this tutorial, I have illustrated how to remove rows with missing values in only one specific data frame column in the R programming language. Please let me know in the comments section if you have any additional questions.

 
