# R Error: contrasts can be applied only to factors with 2 or more levels

This tutorial illustrates how to handle the error message “contrasts can be applied only to factors with 2 or more levels” in the R programming language.

## Creation of Exemplifying Data

The following data will be used as the basis for this R programming tutorial:

```
data <- data.frame(x1 = c(1, 4, 3, 1, 5, 5),    # Create example data
                   x2 = c(7, 7, 7, 1, 1, 2),
                   x3 = as.factor(5),
                   y = c(4, 3, 2, 5, 5, 1))
data                                            # Print example data
#   x1 x2 x3 y
# 1  1  7  5 4
# 2  4  7  5 3
# 3  3  7  5 2
# 4  1  1  5 5
# 5  5  1  5 5
# 6  5  2  5 1
```

As you can see based on the previous output of the RStudio console, the example data has six rows and four columns. The variables x1, x2, and x3 are our predictors, and the variable y is our target variable.

## Example 1: Reproduce the Error: contrasts can be applied only to factors with 2 or more levels

The following R code shows how to replicate the error message “contrasts can be applied only to factors with 2 or more levels”.

Let’s assume that we want to estimate a linear model of our data using the lm function in R:

```
lm(y ~ ., data)                                 # Trying to apply lm()
# Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
#   contrasts can be applied only to factors with 2 or more levels
```

As you can see, the lm function returned the error “contrasts can be applied only to factors with 2 or more levels” to the RStudio console.

The reason for this is that one of our predictor variables has only one factor level.

Have a look at the column x3. As you can see, this column consists only of the value five, so the factor has a single level.
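One quick way to confirm this is to count the levels of each factor column with the nlevels function. The following is only a minimal sketch based on the example data of this tutorial; the object names is_fac and factor_levels are chosen for illustration.

```r
# Re-create the example data of this tutorial
data <- data.frame(x1 = c(1, 4, 3, 1, 5, 5),
                   x2 = c(7, 7, 7, 1, 1, 2),
                   x3 = as.factor(5),
                   y = c(4, 3, 2, 5, 5, 1))

is_fac <- sapply(data, is.factor)               # Which columns are factors?
factor_levels <- sapply(data[is_fac], nlevels)  # Number of levels per factor column
factor_levels                                   # Print level counts
# x3
#  1
```

Please note that nlevels counts all declared levels, even those that do not occur in the data. If your factors might contain unused levels, you could apply droplevels to the data frame first.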

Please note that this error message might also occur when your data contains NA values, even when all of your factor columns consist of more than one factor level.

The reason for this is that the lm function performs listwise deletion to remove all rows with NA values from your data.

If the remaining complete rows contain a factor column with only one level, the error message “contrasts can be applied only to factors with 2 or more levels” appears.
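The NA case can be illustrated with a small sketch. The data frame data_na below is hypothetical and not part of the original example: its factor x3 has two levels, but x1 is missing in every row where x3 equals "b", so only one level should survive the listwise deletion.

```r
# Hypothetical data: x3 has two levels, but x1 is NA wherever x3 is "b"
data_na <- data.frame(x1 = c(1, 4, 3, NA, NA, NA),
                      x3 = factor(c("a", "a", "a", "b", "b", "b")),
                      y = c(4, 3, 2, 5, 5, 1))

nlevels(data_na$x3)                             # Two levels in the full data
# [1] 2

data_complete <- droplevels(na.omit(data_na))   # Keep complete rows, drop unused levels
nlevels(data_complete$x3)                       # Only one level is left
# [1] 1

msg <- tryCatch(lm(y ~ ., data_na),             # Catch the error instead of stopping
                error = function(e) conditionMessage(e))
msg                                             # Print the caught error message
```

In this sketch, tryCatch is only used to store the error message in msg instead of stopping the script. Inspecting the complete rows with na.omit and droplevels shows why lm fails, even though x3 looks fine in the full data.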

So how can we solve this problem? That’s what I’m going to show next!

## Example 2: Fix the Error: contrasts can be applied only to factors with 2 or more levels

Example 2 shows how to deal with the error message “contrasts can be applied only to factors with 2 or more levels”.

As explained in Example 1, this error occurs due to one-level factor variables. So the first step is to identify those variables in our data.

We can do that by using the sapply and lapply functions in combination with the unique and length functions:

```
values_count <- sapply(lapply(data, unique), length)  # Identify variables with 1 value
values_count                                          # Print counts of different values
# x1 x2 x3  y
#  4  3  1  5
```

The previous R code returned a named vector showing the number of different values in each of our columns. As you can see, the variable x3 contains the same value in each data cell.

We can now use this vector to subset our data frame within the lm function so that only variables with more than one value are used as predictors:

```
lm(y ~ ., data[ , values_count > 1])            # Apply lm() to subset of data
# Call:
# lm(formula = y ~ ., data = data[, values_count > 1])
#
# Coefficients:
# (Intercept)           x1           x2
#      5.8534      -0.4788      -0.2409
```

The lm function returns valid output, so this looks good!
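If the error is caused by NA values, as noted in Example 1, the same check could be run on the complete rows instead of the full data. The following is a sketch under assumptions: the data frame data_na is hypothetical, and the names data_complete, keep, and fit are chosen for illustration only.

```r
# Hypothetical data: the only row with x3 equal to "b" has a missing x1
data_na <- data.frame(x1 = c(1, 4, 3, 1, 5, NA),
                      x2 = c(7, 7, 7, 1, 1, 2),
                      x3 = factor(c("a", "a", "a", "a", "a", "b")),
                      y = c(4, 3, 2, 5, 5, 1))

data_complete <- na.omit(data_na)               # Remove rows with NA values

values_count <- sapply(lapply(data_complete, unique), length)  # Count values in complete rows
keep <- names(values_count)[values_count > 1]   # Columns with more than one value
keep                                            # Print names of retained columns
# [1] "x1" "x2" "y"

fit <- lm(y ~ ., data_complete[ , keep])        # Fit model to retained columns
coef(fit)                                       # Print estimated coefficients
```

Keep in mind that na.omit removes every row with an NA in any column, and that dropping x3 changes the model. Whether that is acceptable depends on your analysis.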

## Video & Further Resources

In this tutorial, you have learned how to deal with the error “contrasts can be applied only to factors with 2 or more levels” in the R programming language. Don’t hesitate to let me know in the comments if you have additional questions.

• Mayur Dhage
April 4, 2021 6:12 pm

Hello Joachim! Excellent explanation as always.
I am building a logistic regression model in R and I am getting the same “error in contrast” thing, which is so frustrating. All the factors have more than two levels, but the error is still showing up. Probably the reason is NA values. I checked it with “sapply(train.glm, function(x) if (is.factor(x)) length(levels(x)) else NA)” and it was showing two variables with NA. But when running “is.na(data)”, no NA values were shown.
Can you please post a solution for when the “error in contrast” is due to NA values?

• Hi Mayur,

Thank you for the very kind words!

I recommend creating a subset of your data, where all rows with NA values have been removed. You can learn how to do that here: https://statisticsglobe.com/r-remove-data-frame-rows-with-some-or-all-na

Afterwards, you can use the code shown in this tutorial to identify variables with only one factor level (i.e. Example 2).

I hope that helps!

Joachim