Create Dummy Variable in R (3 Examples)
This tutorial shows how to generate dummy variables in the R programming language.
The tutorial consists of three examples, followed by a summary with further resources.
So let’s just jump right in…
Example 1: Convert Character String with Two Values to Dummy Using ifelse() Function
In Example 1, I’ll explain how to convert a character vector (or a factor) that contains two different values to a dummy indicator. Let’s first create such a character vector in R:
vec1 <- c("yes", "no", "no", "yes", "no")    # Create input vector
vec1                                         # Print input vector
# [1] "yes" "no"  "no"  "yes" "no"
The previous RStudio console output shows the structure of our example vector. It consists of five character strings that are either “yes” or “no”.
We can now convert this input vector to a numeric dummy indicator using the ifelse function:
dummy1 <- ifelse(vec1 == "yes", 1, 0)        # Apply ifelse function
dummy1                                       # Print dummy
# [1] 1 0 0 1 0
The dummy vector is equal to 1 wherever the input vector is equal to “yes” and equal to 0 wherever it is equal to “no”.
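Since Example 1 mentions that the input may also be a factor, here is a minimal sketch of a few variations on the same idea. The object names fac1, dummy1_fac, dummy1_int, vec1_na, and dummy1_na are hypothetical and only used for illustration:

```r
vec1 <- c("yes", "no", "no", "yes", "no")   # Input vector from Example 1

fac1 <- factor(vec1)                        # Hypothetical factor version of the input
dummy1_fac <- ifelse(fac1 == "yes", 1, 0)   # The comparison uses the factor labels
dummy1_fac
# [1] 1 0 0 1 0

dummy1_int <- as.integer(vec1 == "yes")     # Alternative: convert the logical to integer
dummy1_int
# [1] 1 0 0 1 0

vec1_na <- c("yes", NA, "no")               # Hypothetical input with a missing value
dummy1_na <- ifelse(vec1_na == "yes", 1, 0) # NA in the input stays NA in the dummy
dummy1_na
# [1]  1 NA  0
```

The as.integer() variant is a bit shorter, while ifelse() makes it easier to choose other output values than 1 and 0.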
Example 2: Convert Categorical Variable to Dummy Matrix Using model.matrix() Function
Example 2 explains how to create a dummy matrix based on an input vector with multiple values (i.e. a categorical variable). Let’s create another example vector in R:
vec2 <- c("yes", "no", "maybe", "yes", "yes", "maybe")  # Create input vector
vec2                                                    # Print input vector
# [1] "yes"   "no"    "maybe" "yes"   "yes"   "maybe"
Our example vector consists of six character strings that are either “yes”, “no”, or “maybe”.
We can convert this vector to a dummy matrix using the model.matrix function as shown below. Note that we are also using the as.data.frame function, since this makes the output a bit prettier and easier to read (in my opinion).
dummy2 <- as.data.frame(model.matrix(~ vec2 - 1))  # Apply model.matrix function
dummy2                                             # Print dummy
#   vec2maybe vec2no vec2yes
# 1         0      0       1
# 2         0      1       0
# 3         1      0       0
# 4         0      0       1
# 5         0      0       1
# 6         1      0       0
Have a look at the previous output of the RStudio console. Our input vector was converted to a data frame consisting of three dummy indicators that correspond to the three different values of our input vector.
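One detail that may matter when the dummies are used as predictors in a regression: the "- 1" in the formula removes the intercept, which is why each of the three categories gets its own column. The following sketch, assuming the same vec2 as above, shows what model.matrix returns when the intercept is kept. The object name dummy2_ref is hypothetical:

```r
vec2 <- c("yes", "no", "maybe", "yes", "yes", "maybe")  # Input vector from Example 2

dummy2_ref <- model.matrix(~ vec2)   # Keep the intercept (no "- 1")
colnames(dummy2_ref)                 # "maybe" is the reference level and gets no column
# [1] "(Intercept)" "vec2no"      "vec2yes"
```

With the default treatment contrasts, the first level in alphabetical order ("maybe") serves as the reference category, so only two dummy columns are created in addition to the intercept.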
Example 3: Generate Random Dummy Vector Using rbinom() Function
It is also possible to generate random binomial dummy indicators using the rbinom function.
The following R code generates a dummy in which each value is equal to 1 with a probability of 30% and equal to 0 with a probability of 70%:
set.seed(9376562)                                  # Set random seed
dummy3 <- rbinom(n = 10, size = 1, prob = 0.3)     # Apply rbinom function
dummy3                                             # Print dummy
# [1] 1 0 0 1 0 1 0 1 0 0
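Note that prob = 0.3 is the probability of each individual draw, so a small sample does not need to contain exactly 30% ones (the output above contains four ones out of ten values). Here is a minimal sketch of how the observed share could be checked on a larger sample. The object name dummy3_big and the sample size of 10000 are assumptions made for illustration:

```r
set.seed(9376562)                                      # Set random seed for reproducibility
dummy3_big <- rbinom(n = 10000, size = 1, prob = 0.3)  # Hypothetical larger sample
mean(dummy3_big)                                       # Share of ones; should be close to 0.3
table(dummy3_big)                                      # Counts of 0 and 1
```

The larger the sample, the closer the share of ones typically gets to the specified probability.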
Video, Further Resources & Summary
Do you need more information on the R code of this tutorial? Then I recommend watching the following video on the Statistics Globe YouTube channel, in which I explain the R code of this article:
The YouTube video will be added soon.
In addition, you might want to have a look at the related articles that I have published on https://www.statisticsglobe.com/.
You learned in this tutorial how to make a dummy in the R programming language. If you have additional questions, please let me know in the comments. In addition, don’t forget to subscribe to my email newsletter for updates on new tutorials.