Convert Factor to Dummy Indicator Variables for Every Level in R (Example)
This page explains how to expand a factor column to dummy variables for each factor level in the R programming language.
The content of the tutorial is structured as follows:
- Creation of Example Data
- Example: Converting Factor to 1/0 Dummy Indicator
- Video, Further Resources & Summary
Here’s the step-by-step process:
Creation of Example Data
In this tutorial, we’ll use the following example data frame:
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),    # Create example data
                   x2 = 1,
                   x3 = 2)
data                                                                  # Print example data
#    x1 x2 x3
# 1   a  1  2
# 2   b  1  2
# 3   a  1  2
# 4 XXX  1  2
# 5   C  1  2
# 6   b  1  2
# 7 abc  1  2
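Our example data consists of seven rows and three columns. The first column, i.e. the variable x1, contains five different values. Note that data.frame creates x1 as a factor only in R versions before 4.0.0; in R 4.0.0 and later, x1 is a character column unless stringsAsFactors = TRUE is specified. The model.matrix function used below handles both cases, because it converts character columns to factors internally.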
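To confirm how x1 is stored and to make the conversion explicit, a minimal sketch could look like this (it recreates the example data so that it runs on its own):

```r
# Recreate the example data from above
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2)

class(data$x1)                   # "character" in R >= 4.0.0, "factor" in older versions

data$x1 <- as.factor(data$x1)    # Convert explicitly, independent of the R version
levels(data$x1)                  # Show the factor levels (order depends on the locale)
nlevels(data$x1)                 # Number of factor levels: 5
```

The explicit as.factor call is optional for the example below, but it makes the level structure visible before the dummies are created.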
Example: Converting Factor to 1/0 Dummy Indicator
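If we want to expand our data frame so that every factor level of x1 is represented in a dummy column, we can use the model.matrix function. The - 1 in the formula removes the intercept, so that every level gets its own dummy column instead of the first level being dropped as the reference category: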
model.matrix( ~ x1 - 1, data)                                         # Convert to dummies
#   x1a x1abc x1b x1C x1XXX
# 1   1     0   0   0     0
# 2   0     0   1   0     0
# 3   1     0   0   0     0
# 4   0     0   0   0     1
# 5   0     0   0   1     0
# 6   0     0   1   0     0
# 7   0     1   0   0     0
# attr(,"assign")
# [1] 1 1 1 1 1
# attr(,"contrasts")
# attr(,"contrasts")$x1
# [1] "contr.treatment"
As the output in the RStudio console shows, the previous R syntax returns a dummy matrix with one 1/0 column for each level of our variable x1.
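To illustrate the role of the intercept in the formula, the following sketch compares the dummy matrix with and without - 1. The object names dummies_all and dummies_ref are only chosen for this illustration:

```r
# Recreate the example data from above
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2)

dummies_all <- model.matrix( ~ x1 - 1, data)   # No intercept: one column per level
dummies_ref <- model.matrix( ~ x1, data)       # With intercept: first level is the reference

colnames(dummies_all)                          # Five dummy columns
colnames(dummies_ref)                          # "(Intercept)" plus four dummy columns

rowSums(dummies_all)                           # Every row contains exactly one 1
```

Without the intercept, each row has exactly one dummy equal to 1. With the intercept, the first level has no column of its own and is identified by all remaining dummies being 0.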
If we want to combine these dummies with the remaining columns of our original data frame, we can use the following R programming code:
data_dummy <- data.frame(data[ , ! colnames(data) %in% "x1"],        # Create dummy data
                         model.matrix( ~ x1 - 1, data))
data_dummy                                                            # Print dummy data
#   x2 x3 x1a x1abc x1b x1C x1XXX
# 1  1  2   1     0   0   0     0
# 2  1  2   0     0   1   0     0
# 3  1  2   1     0   0   0     0
# 4  1  2   0     0   0   0     1
# 5  1  2   0     0   0   1     0
# 6  1  2   0     0   1   0     0
# 7  1  2   0     1   0   0     0
The final output consists of the variables of our original data frame (except x1) plus the dummy variables that reflect the factor levels of x1.
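If the same conversion is needed for several data frames or columns, the two steps can be wrapped in a small function. The following is only a sketch: expand_factor is a hypothetical helper name (not part of base R or any package), and it assumes that the chosen column contains no missing values, because model.matrix drops rows with NA by default:

```r
# Hypothetical helper: replace one column of a data frame by its dummy columns
# Assumption: df[[col]] contains no NA values
expand_factor <- function(df, col) {
  dummy_formula <- as.formula(paste("~", col, "- 1"))      # E.g. ~ x1 - 1
  dummies <- model.matrix(dummy_formula, data = df)        # One 1/0 column per level
  data.frame(df[ , colnames(df) != col, drop = FALSE],     # Keep all other columns
             dummies)
}

# Recreate the example data from above
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2)

data_dummy2 <- expand_factor(data, "x1")                   # Apply helper to x1
dim(data_dummy2)                                           # 7 rows and 7 columns
```

The drop = FALSE argument keeps the remaining columns as a data frame even if only a single column is left after removing the factor.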
Video, Further Resources & Summary
Do you need more information on the R code of this article? Then you may want to watch the following video on my YouTube channel, in which I explain the contents of this page:
Furthermore, you could read the related posts on my homepage:
- Reorder Levels of Factor without Changing Order of Values
- recode & recode_factor R Functions of dplyr Package
- Convert Factor to Character
- Convert Character to Factor in R
- Warning: invalid factor level, NA generated
- droplevels Function in R
- The R Programming Language
At this point, you should know how to automatically expand a factor column into binary dummies in R programming. Let me know in the comments if you have additional questions. In addition, please subscribe to my email newsletter to get updates on new posts.