# Convert Factor to Dummy Indicator Variables for Every Level in R (Example)

This page explains how to expand a factor column to dummy variables for each factor level in the R programming language.

## Creation of Example Data

This tutorial uses the following data frame as its example:

```r
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),   # Create example data
                   x2 = 1,
                   x3 = 2,
                   stringsAsFactors = TRUE)                         # Store x1 as a factor
data                                                                # Print example data
#    x1 x2 x3
# 1   a  1  2
# 2   b  1  2
# 3   a  1  2
# 4 XXX  1  2
# 5   C  1  2
# 6   b  1  2
# 7 abc  1  2
```

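Our example data consists of seven rows and three columns. The first column, x1, is a categorical variable with five distinct levels: a, abc, b, C, and XXX. Note that since R 4.0.0, data.frame() stores strings as character by default, so x1 is only a true factor if stringsAsFactors = TRUE is set or the column is converted with factor(). The model.matrix function used below treats character columns as factors either way.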
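To confirm how x1 is stored before converting it, a quick check along the following lines can help. This is a minimal sketch that rebuilds the example data and sets stringsAsFactors = TRUE explicitly, which is an assumption about how the data was created.

```r
# Recreate the example data; stringsAsFactors = TRUE is an explicit assumption here
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2,
                   stringsAsFactors = TRUE)

is.factor(data$x1)   # Check whether x1 is stored as a factor
nlevels(data$x1)     # Count the distinct factor levels
levels(data$x1)      # Show the level labels (their sort order depends on the locale)
table(data$x1)       # Show how often each level occurs
```

Because the order of the levels depends on the collation rules of the current locale, the column order of the dummy matrix can differ between systems.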

## Example: Converting Factor to 1/0 Dummy Indicator

If we want to expand our data frame so that every factor level of x1 is represented in a dummy column, we can use the model.matrix function as shown below:

```r
model.matrix( ~ x1 - 1, data)                                       # Convert to dummies
#   x1a x1abc x1b x1C x1XXX
# 1   1     0   0   0     0
# 2   0     0   1   0     0
# 3   1     0   0   0     0
# 4   0     0   0   0     1
# 5   0     0   0   1     0
# 6   0     0   1   0     0
# 7   0     1   0   0     0
# attr(,"assign")
# [1] 1 1 1 1 1
# attr(,"contrasts")
# attr(,"contrasts")$x1
# [1] "contr.treatment"
```

As the console output shows, the previous R syntax returns a dummy matrix with one 1/0 column for each level of our factor variable x1.
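The `- 1` in the formula removes the intercept, which is why every level gets its own column. As a rough comparison, assuming the same example data (rebuilt here so the snippet runs on its own), keeping the intercept produces an intercept column plus one dummy for each non-reference level:

```r
# Recreate the example data (stringsAsFactors = TRUE is an explicit assumption)
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2,
                   stringsAsFactors = TRUE)

mm_full <- model.matrix( ~ x1 - 1, data)   # No intercept: one column per level
mm_ref  <- model.matrix( ~ x1, data)       # With intercept: first level is the reference

ncol(mm_full)                              # Equals the number of factor levels
colnames(mm_ref)                           # "(Intercept)" plus the non-reference levels
rowSums(mm_full)                           # Each row contains exactly one 1
```

The version without an intercept is the one to use when a separate indicator is needed for every level.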

If we want to combine these dummies with the remaining columns of our original data frame, we can use the following R programming code:

```r
data_dummy <- data.frame(data[ , ! colnames(data) %in% "x1"],       # Create dummy data
                         model.matrix( ~ x1 - 1, data))
data_dummy                                                          # Print dummy data
#   x2 x3 x1a x1abc x1b x1C x1XXX
# 1  1  2   1     0   0   0     0
# 2  1  2   0     0   1   0     0
# 3  1  2   1     0   0   0     0
# 4  1  2   0     0   0   0     1
# 5  1  2   0     0   0   1     0
# 6  1  2   0     0   1   0     0
# 7  1  2   0     1   0   0     0
```

The final output consists of the variables of our original data frame (except x1) plus the dummy variables that reflect the factor levels of x1.
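If the same conversion is needed for other columns, the steps can be wrapped in a small function. The following is only a sketch: add_dummies is a hypothetical helper name, and it assumes that the column name is syntactically valid and that the column contains no missing values (model.matrix drops rows with NA by default, so the row counts would no longer match).

```r
# Hypothetical helper (not part of base R): replace one column by its dummy columns
add_dummies <- function(df, col) {
  f <- as.formula(paste("~", col, "- 1"))            # Build the formula ~ col - 1
  dummies <- model.matrix(f, df)                     # One 1/0 column per level
  keep <- df[ , colnames(df) != col, drop = FALSE]   # drop = FALSE keeps a data frame
  data.frame(keep, dummies)                          # Combine remaining columns and dummies
}

# Recreate the example data (stringsAsFactors = TRUE is an explicit assumption)
data <- data.frame(x1 = c("a", "b", "a", "XXX", "C", "b", "abc"),
                   x2 = 1,
                   x3 = 2,
                   stringsAsFactors = TRUE)

data_dummy2 <- add_dummies(data, "x1")               # Apply the helper to x1
dim(data_dummy2)                                     # Rows and columns of the result
```

Using drop = FALSE inside the helper keeps the remaining columns as a data frame even if only a single column is left after removing the factor.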

