# Estimate Linear Model with Fixed Intercept in R (2 Examples)

In this post you’ll learn how to set a fixed intercept when estimating a linear regression model in the R programming language.

It’s time to dive into the programming part.

## Creation of Example Data

The data below will be used as the basis for this R tutorial:

```
set.seed(653897)          # Create example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
```

Our example data consists of two correlated, normally distributed numeric vectors.

Let’s estimate a linear regression model without specifying the intercept manually (i.e. the default specification of the lm function):

```
mod_default <- lm(y ~ x)    # Estimate linear model
summary(mod_default)        # Summary statistics
# Call:
# lm(formula = y ~ x)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -3.3152 -0.6598  0.0209  0.6563  3.4294 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)  2.05729    0.09966   20.64   <2e-16 ***
# x            0.98086    0.03156   31.08   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.9891 on 998 degrees of freedom
# Multiple R-squared:  0.4919, Adjusted R-squared:  0.4914 
# F-statistic: 966.1 on 1 and 998 DF,  p-value: < 2.2e-16
```

The previous output of the RStudio console shows the descriptive summary statistics of our linear regression model. As you can see, we have estimated an intercept of 2.05729 and a regression coefficient for x of 0.98086.
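Since the data-generating process above adds a constant of 2 to x plus noise, the true intercept is 2 and the true slope is 1, so the estimates come out close to these values. As a quick sketch (reusing the same seed as above), we can also extract the fitted coefficients programmatically via coef():

```r
set.seed(653897)                  # Same seed as the example data above
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x           # True intercept = 2, true slope = 1

mod_default <- lm(y ~ x)          # Default lm() estimates the intercept
coef(mod_default)                 # (Intercept) approx. 2.057, x approx. 0.981
```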

Let’s estimate another model with fixed intercept…

## Example 1: Estimate Linear Model with Fixed Intercept Using I() Function

Example 1 illustrates how to estimate a linear regression model with a known (i.e. fixed) intercept.

For this, we first have to specify our fixed intercept:

`intercept <- 3 # Define fixed intercept`

Next, we can estimate our linear model using the I() function as shown below:

`mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x) # Model with fixed intercept`

Finally, we can apply the summary function to return our descriptive statistics:

```
summary(mod_intercept_1)    # Summary statistics
# Call:
# lm(formula = I(y - intercept) ~ 0 + x)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -3.0314 -0.7734 -0.0577  0.6222  3.1767 
# 
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)    
# x  0.69743    0.01033   67.49   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 1.032 on 999 degrees of freedom
# Multiple R-squared:  0.8201, Adjusted R-squared:  0.82 
# F-statistic:  4555 on 1 and 999 DF,  p-value: < 2.2e-16
```

As you can see, the previously estimated model did not return an intercept value, since we have specified this intercept manually beforehand.

You can also see that the x estimate has changed to 0.69743.
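The change in the slope can be reproduced by hand: for a regression through the origin, the OLS slope is sum(x * y) / sum(x^2), and subtracting the fixed intercept from y first gives exactly the estimate reported above. A minimal sketch, recreating the example data:

```r
set.seed(653897)                  # Recreate the example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3                    # Fixed intercept from Example 1

# Closed-form OLS slope for a no-intercept regression of (y - intercept) on x
slope_manual <- sum(x * (y - intercept)) / sum(x^2)
slope_manual                      # approx. 0.69743, matching summary()
```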

Important notes on models with fixed intercept:

The summary output of models with a fixed intercept has to be interpreted carefully. Metrics such as the R-squared, the t-value, and the F-statistic are computed against a different null model (one with no predictors at all, rather than an intercept-only model) and are therefore much larger than in the model without a fixed intercept.
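To make this concrete: for models fitted without an intercept, lm()'s summary measures R-squared against a zero baseline (i.e. sum(y^2)) rather than against the mean of the response. The following sketch (recreating the data and model from Example 1) contrasts the two baselines:

```r
set.seed(653897)                  # Recreate the example data and model
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3
mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)

res    <- residuals(mod_intercept_1)
y_star <- y - intercept

# Baseline used by summary() for no-intercept models: predicting zero
r2_zero_baseline <- 1 - sum(res^2) / sum(y_star^2)
r2_zero_baseline                  # approx. 0.8201, as reported by summary()

# Conventional baseline: predicting the mean of the response
r2_mean_baseline <- 1 - sum(res^2) / sum((y_star - mean(y_star))^2)
r2_mean_baseline                  # considerably smaller
```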

Furthermore, it is often not advisable to specify a fixed intercept from a theoretical & methodological viewpoint. You may find a detailed discussion on this topic in this thread on Cross Validated.

## Example 2: Estimate Linear Model with Fixed Intercept Using offset() & rep() Functions

This example demonstrates an alternative syntax to the one shown in the previous example.

In this example we’ll use the offset and rep functions to estimate our linear model with known intercept:

```
mod_intercept_2 <- lm(y ~ x + 0 +                  # Model with fixed intercept
                        offset(rep(intercept, 1000)))
```

The coefficient estimates and residuals in the following summary statistics are exactly the same as in Example 1, even though we have used a different R syntax. Note, however, that the reported R-squared and F-statistic differ:

```
summary(mod_intercept_2)    # Summary statistics
# Call:
# lm(formula = y ~ x + 0 + offset(rep(intercept, 1000)))
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -3.0314 -0.7734 -0.0577  0.6222  3.1767 
# 
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)    
# x  0.69743    0.01033   67.49   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 1.032 on 999 degrees of freedom
# Multiple R-squared:  0.9613, Adjusted R-squared:  0.9612 
# F-statistic: 2.479e+04 on 1 and 999 DF,  p-value: < 2.2e-16
```
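Apart from the reported R-squared and F-statistic, the two parameterizations are equivalent. As a quick sketch, we can verify that both models return identical coefficients and residuals:

```r
set.seed(653897)                  # Recreate the data and both models
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3

mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)
mod_intercept_2 <- lm(y ~ x + 0 + offset(rep(intercept, 1000)))

all.equal(coef(mod_intercept_1), coef(mod_intercept_2))            # TRUE
all.equal(residuals(mod_intercept_1), residuals(mod_intercept_2))  # TRUE
```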

## Video, Further Resources & Summary

In case you need further explanations on the content of this article, you may have a look at the following video on my YouTube channel. I’m illustrating the R codes of this article in the video:


In summary: At this point you should know how to define a known constant in a linear regression model in R programming. Let me know in the comments section, if you have further questions or comments on regression models, constants, or any other related topics.


• Andre ENDRESS
September 8, 2021 9:42 am

Hello,
there is a difference in R-squared and F-statistic between the two methods, but why?
BR Andre

• Hey Andre,

Indeed, that’s an interesting question! My guess would be that the fixed intercept has an impact on these two metrics, but to be honest, I’m not 100% sure about this.

Please let me know in case you find a good explanation, I’m also curious now 🙂

Regards

Joachim

• ti
January 6, 2022 5:01 pm

It is because of the different null models used for the F-test. 🙂
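To expand on this comment with a sketch: as far as I understand the stats package internals, summary.lm computes R-squared for no-intercept models as mss / (mss + rss) with mss = sum(fitted^2), and the fitted values of an lm() model include any offset. In Example 1 the fitted values are just slope * x, while in Example 2 they additionally contain the offset of 3, which inflates the model sum of squares and hence the reported R-squared and F-statistic, even though the residuals are identical:

```r
set.seed(653897)                  # Recreate the data and both models
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3

mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)
mod_intercept_2 <- lm(y ~ x + 0 + offset(rep(intercept, 1000)))

rss  <- sum(residuals(mod_intercept_1)^2)   # Identical for both models
mss1 <- sum(fitted(mod_intercept_1)^2)      # Fitted values: slope * x
mss2 <- sum(fitted(mod_intercept_2)^2)      # Fitted values: slope * x + 3

mss1 / (mss1 + rss)    # approx. 0.8201 (Example 1)
mss2 / (mss2 + rss)    # approx. 0.9613 (Example 2)
```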