Convert Factor to Dummy Indicator Variables for Every Level in R (Example)
This page explains how to expand a factor column to dummy variables for each factor level in the R programming language.
The content of the tutorial is structured as follows:
- Creation of Example Data
- Example: Converting Factor to 1/0 Dummy Indicator
- Video, Further Resources & Summary
Here’s the step-by-step process:
Creation of Example Data
In the example of this R programming tutorial, we’ll use the following data frame in R:
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),    # Create example data
                   x2 = 1,
                   x3 = 2)
data                                                                  # Print example data
#    x1 x2 x3
# 1   a  1  2
# 2   b  1  2
# 3   a  1  2
# 4 XXX  1  2
# 5   C  1  2
# 6   b  1  2
# 7 abc  1  2
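Our example data consists of seven rows and three columns. The first column, i.e. the variable x1, contains five different values. In R versions before 4.0.0, data.frame() stores such a column as a factor with five factor levels; from R 4.0.0 onwards it is a character column unless you convert it with factor() or set stringsAsFactors = TRUE. The model.matrix function used below handles both cases, because it coerces character variables to factors.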
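To check how x1 is stored in your own R session and to convert it explicitly, a minimal sketch like the following may help. The explicit factor() call is my addition and not part of the original example:

```r
# Recreate the example data of this tutorial
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2)

class(data$x1)                # Check how x1 is stored in this R session

data$x1 <- factor(data$x1)    # Convert explicitly to a factor
nlevels(data$x1)              # Number of factor levels
# [1] 5
```

The order of the levels returned by levels(data$x1) depends on the sorting rules of your locale, so it may differ between systems.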
Example: Converting Factor to 1/0 Dummy Indicator
If we want to expand our data frame so that every factor level of x1 is represented in a dummy column, we can use the model.matrix function as shown below:
model.matrix( ~ x1 - 1, data)                                         # Convert to dummies
#   x1a x1abc x1b x1C x1XXX
# 1   1     0   0   0     0
# 2   0     0   1   0     0
# 3   1     0   0   0     0
# 4   0     0   0   0     1
# 5   0     0   0   1     0
# 6   0     0   1   0     0
# 7   0     1   0   0     0
# attr(,"assign")
# [1] 1 1 1 1 1
# attr(,"contrasts")
# attr(,"contrasts")$x1
# [1] "contr.treatment"
As the output in the RStudio console shows, the previous R syntax returns a dummy matrix with one 1/0 column for each level of our variable x1.
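The - 1 in the formula removes the intercept. Without it, model.matrix applies treatment contrasts and drops the first level as the reference category, so that level gets no dummy column of its own. The following sketch compares both variants; the object names dummies_all and dummies_ref are chosen purely for illustration:

```r
# Recreate the example data of this tutorial
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2)

dummies_all <- model.matrix( ~ x1 - 1, data)    # No intercept: one column per level
dummies_ref <- model.matrix( ~ x1, data)        # Intercept: first level is the reference

ncol(dummies_all)                               # Five dummy columns
colnames(dummies_ref)                           # "(Intercept)" plus four dummy columns
rowSums(dummies_all)                            # Each row contains exactly one 1
```

Use the version without the intercept when you need an indicator for every level. The version with a reference level is the one typically needed inside regression models.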
If we want to combine these dummies with our original data frame, we can use the following R programming code:
data_dummy <- data.frame(data[ , ! colnames(data) %in% "x1"],         # Create dummy data
                         model.matrix( ~ x1 - 1, data))
data_dummy                                                            # Print dummy data
#   x2 x3 x1a x1abc x1b x1C x1XXX
# 1  1  2   1     0   0   0     0
# 2  1  2   0     0   1   0     0
# 3  1  2   1     0   0   0     0
# 4  1  2   0     0   0   0     1
# 5  1  2   0     0   0   1     0
# 6  1  2   0     0   1   0     0
# 7  1  2   0     1   0   0     0
The final output consists of the variables of our original data frame (except x1) plus the dummy variables that reflect the factor levels of x1.
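If you need this conversion for several data sets, the two steps can be wrapped in a small helper. The function below is only a sketch under my own assumptions: expand_factor is a hypothetical name, not a function of base R or of any package, and it expects a column without missing values.

```r
# Hypothetical helper (not part of base R): replace one column by its dummies
expand_factor <- function(df, col) {
  f <- reformulate(col, intercept = FALSE)            # Builds the formula ~ col - 1
  dummies <- model.matrix(f, df)                      # One 1/0 column per level
  rest <- df[ , colnames(df) != col, drop = FALSE]    # Keep all other columns
  data.frame(rest, dummies)
}

# Recreate the example data of this tutorial
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2)

data_dummy2 <- expand_factor(data, "x1")              # Apply helper to x1
dim(data_dummy2)                                      # Seven rows, seven columns
# [1] 7 7
```

The drop = FALSE argument keeps the remaining columns as a data frame even if only one column is left. Note that model.matrix omits rows with missing values by default, so a column containing NA would lead to a row mismatch in this sketch.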
Video, Further Resources & Summary
Do you need further info on the R code of the present article? Then you might watch the accompanying video on my YouTube channel, in which I explain the contents of this page.
Furthermore, you could read the related posts on my homepage:
- Reorder Levels of Factor without Changing Order of Values
- recode & recode_factor R Functions of dplyr Package
- Convert Factor to Character
- Convert Character to Factor in R
- Warning: invalid factor level, NA generated
- droplevels Function in R
- The R Programming Language
At this point you should know how to automatically expand a factor column into binary dummies in R programming. If you have additional questions, let me know in the comments. In addition, please subscribe to my email newsletter to get updates on new posts.