Replace Blank by NA in R (2 Examples)

 

This post shows how to set blank values in a data frame to NA in the R programming language.

Let’s start right away…

 

Creating Example Data

Consider the following example data:

data <- data.frame(x1 = c("", 5, 7, " ", 1, 1),      # Create data frame
                   x2 = c(3, 2, 1, "", 2, ""),
                   x3 = c("", " ", letters[1:4]))
data                                                 # Print data frame

 

Table 1: Example data frame containing blank cells.

 

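Table 1 shows the structure of our example data: it consists of six rows and three variables. Note that mixing numbers with empty strings inside c() turns every value into a character string, so none of the columns is numeric.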
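As a quick check, the following sketch inspects how R stores each column and counts the exactly empty cells per column. It rebuilds the example data so that it runs on its own. The argument stringsAsFactors = FALSE and the object names col_classes and blank_counts are assumptions added here for illustration; the argument keeps the behaviour the same in R versions before 4.0.0.

```r
# Rebuild the example data (stringsAsFactors = FALSE is added for older R versions)
data <- data.frame(x1 = c("", 5, 7, " ", 1, 1),
                   x2 = c(3, 2, 1, "", 2, ""),
                   x3 = c("", " ", letters[1:4]),
                   stringsAsFactors = FALSE)

col_classes <- sapply(data, class)       # Storage class of each column
blank_counts <- colSums(data == "")      # Number of exactly empty cells per column

col_classes                              # Print column classes
blank_counts                             # Print blank counts
```

Cells that contain a single space are not counted here, because " " is not equal to "".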

Some of the data cells in our data frame are blank. This is a typical problem when importing external data files into R (e.g. CSV, TXT, or XLSX files).
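When the blanks come from a file import, it may be possible to convert them while the file is read. The sketch below is not part of the original examples: it assumes a small, made-up CSV that is passed through the text argument of read.csv instead of a real file. The na.strings argument lists the strings that should be treated as NA.

```r
# Hypothetical CSV content, supplied as text instead of a file for illustration
csv_text <- "x1,x3\n,a\n5,\n7,b"

imported <- read.csv(text = csv_text,
                     na.strings = c("", "NA"),   # Treat empty fields and "NA" as missing
                     stringsAsFactors = FALSE)   # Keep text columns as character
imported                                         # Print imported data frame
```

If the data is already in R, or the blanks should be handled after the import, the examples in this tutorial apply.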

So how can we replace these empty values in our data frame variables?

 

Example 1: Set Blank to NA in Data Frame

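In Example 1, I’ll illustrate how to replace empty cells by NA (i.e. Not Available or missing values) using a logical condition based on the == operator. The comparison of the data frame with "" returns a logical matrix that is TRUE for every empty cell, and using this matrix as an index assigns NA to exactly those cells.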

Have a look at the following R code and the resulting data frame:

data_new1 <- data                                    # Duplicate data frame
data_new1[data_new1 == ""] <- NA                     # Replace blank by NA
data_new1                                            # Print updated data frame

 

Table 2: Data frame after replacing empty cells by NA.

 

Table 2 shows the output of the previously shown R syntax: A data frame where blank values were replaced by NA.

You may already have noticed that some of the data cells still look empty. The reason for this is that some of the data cells in our input data frame are not entirely empty, but contain only a space (i.e. " ").
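To see which cells are affected, a small sketch such as the following can locate the cells that hold only a single space. It repeats the steps of Example 1 so that it runs on its own; stringsAsFactors = FALSE and the name space_cells are assumptions added for illustration.

```r
# Rebuild the example data and repeat Example 1
data <- data.frame(x1 = c("", 5, 7, " ", 1, 1),
                   x2 = c(3, 2, 1, "", 2, ""),
                   x3 = c("", " ", letters[1:4]),
                   stringsAsFactors = FALSE)
data_new1 <- data
data_new1[data_new1 == ""] <- NA

# Row and column positions of cells that contain exactly one space
# (which() ignores the NA results that arise from comparing NA cells)
space_cells <- which(data_new1 == " ", arr.ind = TRUE)
space_cells                              # Print positions
```

In our example data, this points to row 4 of x1 and row 2 of x3.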

In the next example, I’ll explain how to set spaces to NA as well.

 

Example 2: Set Blank & Space to NA in Data Frame

In this example, I’ll illustrate how to substitute blanks AND spaces with NA. For this, we can use the | operator to specify another logical condition:

data_new2 <- data                                    # Duplicate data frame
data_new2[data_new2 == "" | data_new2 == " "] <- NA  # Replace blank & space by NA
data_new2                                            # Print updated data frame

 

Table 3: Data frame after replacing empty cells and single spaces by NA.

 

Table 3 shows that we have managed to create a data frame without blanks or spaces: all of these cells are now NA.
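The condition in Example 2 only catches cells that hold exactly one space. As a more general sketch (not part of the original examples), trimws can strip leading and trailing whitespace before the comparison, and type.convert can afterwards restore numeric column types. The name data_new3 and the argument stringsAsFactors = FALSE are assumptions introduced here for illustration.

```r
# Rebuild the example data
data <- data.frame(x1 = c("", 5, 7, " ", 1, 1),
                   x2 = c(3, 2, 1, "", 2, ""),
                   x3 = c("", " ", letters[1:4]),
                   stringsAsFactors = FALSE)

data_new3 <- data                                    # Duplicate data frame
data_new3[] <- lapply(data_new3, function(col) {
  col <- trimws(col)                                 # Remove leading/trailing whitespace
  col[col == ""] <- NA                               # Cells that are empty after trimming become NA
  type.convert(col, as.is = TRUE)                    # Restore numeric type where possible
})
data_new3                                            # Print updated data frame
```

With this variant, x1 and x2 end up as integer columns, whereas x3 stays a character column because it contains letters.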

 

Summary

In this R article you have learned how to replace blanks by NA. Let me know in the comments in case you have further questions.

 
