Group Factor Levels in R (Example)
In this R tutorial you’ll learn how to regroup factor levels of a factor vector or column.
You’re here for the answer, so let’s get straight to the R syntax:
Construction of Example Data
As the first step, we’ll have to create some example data:
x <- factor(c("a", "b", "a", "c", "c"))    # Create example factor
x                                          # Print example factor
# [1] a b a c c
# Levels: a b c
The previous output of the RStudio console shows that our example data is a factor vector with three different factor levels.
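If you want to verify the levels of a factor before regrouping them, a few base R functions can help. The following lines are a minimal sketch that recreates the example factor from above and inspects it:

```r
x <- factor(c("a", "b", "a", "c", "c"))    # Same example factor as above

levels(x)                                  # Labels of the factor levels
# [1] "a" "b" "c"

nlevels(x)                                 # Number of factor levels
# [1] 3

table(x)                                   # Number of observations per level
```

The order returned by levels() matters for the next step, because new labels are assigned to the old levels by position.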
Example: Combine Factor Levels Using levels() Function
In this example, I’ll illustrate how to merge two factor levels into one category using the levels function in R.
Have a look at the following R code:
x_new <- x                                 # Duplicate example factor
levels(x_new) <- c("a", "b", "b")          # Regroup factor levels
x_new                                      # Print updated factor
# [1] a b a b b
# Levels: a b
As you can see based on the previous output of the RStudio console, we have created a new factor vector called x_new that contains only two factor levels. The new labels are matched to the old levels a, b, and c by position. Because the label “b” is assigned to both the second and the third level, the factor level “c” was merged into the factor level “b”.
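The positional assignment can be hard to read when a factor has many levels. As an alternative, levels() also accepts a named list in which each name is a new level and each element contains the old levels to be merged. The following is a sketch of this variant; the object name x_list is only chosen for illustration:

```r
x <- factor(c("a", "b", "a", "c", "c"))    # Same example factor as above

x_list <- x                                # Duplicate example factor
levels(x_list) <- list(a = "a",            # New level "a" keeps old level "a"
                       b = c("b", "c"))    # New level "b" merges "b" and "c"
x_list                                     # Print updated factor
# [1] a b a b b
# Levels: a b
```

Both approaches lead to the same result here. The list form states explicitly which old levels end up in which group, so it does not depend on the order of the old levels.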
Video & Further Resources
Do you need more info on the contents of this tutorial? Then you might want to watch the following video on my YouTube channel. In the video, I explain the R code of this article.
Furthermore, you might have a look at the related articles on my homepage. Please find a selection of articles below.
- droplevels R Example
- Reorder Levels of Factor without Changing Order of Values
- Keep Unused Factor Levels in ggplot2 Barplot
- Introduction to R Programming
In this R tutorial you have learned how to collapse factor levels of a factor vector.
Please note that the same R syntax can also be applied to regroup a factor column of a data frame instead of a vector.
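For a data frame, this might look as follows. The data frame df and its columns id, group, and group_new are hypothetical names used only for this sketch:

```r
df <- data.frame(id = 1:5,                             # Create example data frame
                 group = factor(c("a", "b", "a", "c", "c")))

df$group_new <- df$group                               # Duplicate factor column
levels(df$group_new) <- c("a", "b", "b")               # Regroup factor levels
df                                                     # Print updated data frame
#   id group group_new
# 1  1     a         a
# 2  2     b         b
# 3  3     a         a
# 4  4     c         b
# 5  5     c         b
```

Storing the result in a new column keeps the original grouping available for comparison.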
Let me know in the comments section if you have any further questions.