R Help (2 Examples) – Warning: invalid factor level, NA generated
The tutorial contains the following:
- Example 1: Invalid Factor Level in Vector
- Example 2: Invalid Factor Level in Data Frame
- Video & Further Resources
It’s time to dive into the examples.
Example 1: Invalid Factor Level in Vector
The warning message “invalid factor level, NA generated” can appear in a vector or a data frame. In the first example, I’ll show you how to handle this warning in vectors.
Consider the following factor vector in R:
x <- factor(c("A", "B", "A", "C"))    # Create example vector
x                                     # Print example vector
# [1] A B A C
# Levels: A B C
Our example vector contains four elements and the three factor levels A, B, and C. To illustrate the warning message, I’ll replicate our example vector first:
x1 <- x # Replicate example vector
Now, let’s assume that we want to add a new element to our vector.
Let’s see what happens in practice:
x1[5] <- "D"    # Try to add element to vector
# Warning message:
# In `[<-.factor`(`*tmp*`, 5, value = "D") :
#   invalid factor level, NA generated
The RStudio console returns the warning “invalid factor level, NA generated”. Also, our new vector contains an NA value instead of the new value that we wanted to append:
x1    # Print example vector
# [1] A B A C <NA>
# Levels: A B C
The reason for this is that we tried to append the value “D”, i.e. a value that is not a factor level of our example vector.
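One way to see this before assigning is to compare the new value with the levels of the factor. The following is a minimal sketch using the base R functions levels() and %in%; the object name new_value is only illustrative and not part of the original example.

```r
x <- factor(c("A", "B", "A", "C"))   # Example factor from above
new_value <- "D"                     # Hypothetical value we want to append

levels(x)                            # Levels currently defined for x: "A" "B" "C"
new_value %in% levels(x)             # FALSE: "D" is not a valid level of x
```

If the check returns FALSE, assigning the value directly to the factor produces the warning and an NA.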
If we want to append a value that is not yet a level of the factor, we need to convert the vector to the character class first. Let’s do this in practice.
First, we replicate our original vector…
x2 <- x # Replicate example vector
…then we convert this vector to the character class…
x2 <- as.character(x2) # Convert factor to character
…then we concatenate our vector with the new value “D” without a warning message…
x2 <- c(x2, "D")    # Add element to vector
…and then we convert our combined vector back to the factor class:
x2 <- as.factor(x2) # Convert character to factor
Our final output looks as follows:
x2    # New factor level was created
# [1] A B A C D
# Levels: A B C D
The result is a factor vector with the new factor level D.
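As an alternative that is not part of the steps above, base R also allows us to add the new level to the factor first and then assign the value. This is a minimal sketch, assuming the same example vector x; the name x3 is just illustrative.

```r
x <- factor(c("A", "B", "A", "C"))   # Example factor from above
x3 <- x                              # Replicate example vector

levels(x3) <- c(levels(x3), "D")     # Add "D" as an additional factor level
x3[length(x3) + 1] <- "D"            # Append the new value; "D" is now a valid level
x3                                   # Print updated factor
```

This approach keeps the vector in the factor class throughout, which may be preferable when the order of the existing levels matters.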
Example 2: Invalid Factor Level in Data Frame
The warning message “invalid factor level, NA generated” often appears in data frames. Consider the following example data:
data1 <- data.frame(col1 = c("A", "B", "A", "C"),    # Create example data
                    col2 = "x",
                    stringsAsFactors = TRUE)
data1                                                # Print example data
#   col1 col2
# 1    A    x
# 2    B    x
# 3    A    x
# 4    C    x
If we try to replace an element of this data frame with a new value, we receive the warning “invalid factor level, NA generated”:
data1$col1[4] <- "D"    # Try to replace column value
# Warning message:
# In `[<-.factor`(`*tmp*`, 4, value = c(1L, 2L, 1L, NA)) :
#   invalid factor level, NA generated
At this point we have two options: either we convert our data frame column to the character class manually, or we create our example data frame with the specification stringsAsFactors = FALSE:
data2 <- data.frame(col1 = c("A", "B", "A", "C"),    # Specify stringsAsFactors = FALSE
                    col2 = "x",
                    stringsAsFactors = FALSE)
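In R versions before 4.0.0, the data.frame function automatically converts character columns to factors; since R 4.0.0, the default is stringsAsFactors = FALSE. By specifying stringsAsFactors = FALSE explicitly, we retain the character class in any version.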
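To check which class a column actually has, we can inspect each column with sapply() and class(). The following is a small sketch; df_factor and df_char are illustrative names that do not appear in the example data above.

```r
# Two small data frames that differ only in the stringsAsFactors setting
df_factor <- data.frame(col1 = c("A", "B"), stringsAsFactors = TRUE)
df_char   <- data.frame(col1 = c("A", "B"), stringsAsFactors = FALSE)

sapply(df_factor, class)   # col1 is "factor"
sapply(df_char, class)     # col1 is "character"
```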
Now, we can easily replace the elements of our data frame with new values:
data2$col1[4] <- "D"    # Replace column value
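The first option mentioned above, converting the column manually, could be sketched as follows. This is only one possible way to do it; the name data3 is illustrative, and converting back to a factor at the end is optional.

```r
data3 <- data.frame(col1 = c("A", "B", "A", "C"),
                    col2 = "x",
                    stringsAsFactors = TRUE)   # col1 and col2 are factors

data3$col1 <- as.character(data3$col1)         # Convert factor column to character
data3$col1[4] <- "D"                           # Replace the fourth value without a warning
data3$col1 <- as.factor(data3$col1)            # Optional: convert back to factor
levels(data3$col1)                             # "A" "B" "D"
```

Note that the level C disappears here, because the only C value was replaced before the column was converted back to a factor.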
Video & Further Resources
I have recently released a video on my YouTube channel, which shows the topics of this page. You can find the video below.
In addition, you might want to read the other articles of this website. You can find a selection of interesting articles here:
- message, warning & stop Functions in R
- droplevels R Function
- How to Convert a Factor to Numeric in R
- Convert Data Frame Column to Numeric
- The R Programming Language
In summary: On this page, I illustrated how to deal with the warning message “invalid factor level, NA generated” in the R programming language. If you have additional questions or comments, let me know in the comments.