# Estimate Linear Model with Fixed Intercept in R (2 Examples)

In this post you’ll learn how to set a fixed intercept when estimating a linear regression model in the R programming language.

It’s time to dive into the programming part.

## Creation of Example Data

The data below will be used as the basis for this R tutorial:

```
set.seed(653897)                                   # Create example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
```

Our example data consists of two randomly distributed numeric vectors that are correlated with each other.
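Since y was created by adding x to normally distributed noise, the correlation between the two vectors can be checked directly. A quick sketch, assuming the vectors x and y created above:

```r
cor(x, y)                                          # Pearson correlation of x & y
# Roughly 0.7 for this data generating process
```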

Let’s estimate a linear regression model without specifying the intercept manually (i.e. the default specification of the lm function):

```
mod_default <- lm(y ~ x)                           # Estimate linear model
summary(mod_default)                               # Summary statistics
# Call:
# lm(formula = y ~ x)
#
# Residuals:
#     Min      1Q  Median      3Q     Max
# -3.3152 -0.6598  0.0209  0.6563  3.4294
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  2.05729    0.09966   20.64   <2e-16 ***
# x            0.98086    0.03156   31.08   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.9891 on 998 degrees of freedom
# Multiple R-squared:  0.4919,	Adjusted R-squared:  0.4914
# F-statistic: 966.1 on 1 and 998 DF,  p-value: < 2.2e-16
```

The previous output of the RStudio console shows the summary statistics of our linear regression model. As you can see, we have estimated an intercept of 2.05729 and a regression coefficient for x of 0.98086.
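Since the example data was generated with a true intercept of 2 and a true slope of 1, both estimates are close to their targets. As a quick sketch, the confint function returns confidence intervals that should cover these true values (assuming the mod_default object from above; the printed bounds are approximate):

```r
confint(mod_default)                               # 95% confidence intervals
#                2.5 %   97.5 %
# (Intercept) ~ 1.862  ~ 2.253                     # covers the true intercept 2
# x           ~ 0.919  ~ 1.043                     # covers the true slope 1
```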

Let’s estimate another model with fixed intercept…

## Example 1: Estimate Linear Model with Fixed Intercept Using I() Function

Example 1 illustrates how to estimate a linear regression model with a known (fixed) intercept using the I() function.

For this, we first have to specify our fixed intercept:

`intercept <- 3                                     # Define fixed intercept`

Next, we can estimate our linear model using the I() function as shown below:

`mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)    # Model with fixed intercept`

Finally, we can apply the summary function to return the summary statistics of this model:

```
summary(mod_intercept_1)                           # Summary statistics
# Call:
# lm(formula = I(y - intercept) ~ 0 + x)
#
# Residuals:
#     Min      1Q  Median      3Q     Max
# -3.0314 -0.7734 -0.0577  0.6222  3.1767
#
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)
# x  0.69743    0.01033   67.49   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 1.032 on 999 degrees of freedom
# Multiple R-squared:  0.8201,	Adjusted R-squared:   0.82
# F-statistic:  4555 on 1 and 999 DF,  p-value: < 2.2e-16
```

As you can see, the previously estimated model did not return an intercept value, since we have specified this intercept manually beforehand.

You can also see that the x estimate has changed to 0.69743.
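Note that the fitted values of this model refer to the transformed outcome y - intercept. To obtain predictions on the original scale of y, the fixed intercept has to be added back. A minimal sketch, assuming the mod_intercept_1 and intercept objects from above:

```r
fit_y <- predict(mod_intercept_1) + intercept      # Predictions on scale of y
head(fit_y)                                        # First predicted values
```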

Important notes on models with fixed intercept:

The summary output of models with fixed intercept has to be interpreted carefully. Metrics such as the R-squared, the t-value, and the F-statistic are much larger than in the model without fixed intercept, since they are computed relative to a different null model when the intercept is omitted.

Furthermore, it is often not advisable to specify a fixed intercept from a theoretical & methodological viewpoint. You may find a detailed discussion on this topic in this thread on Cross Validated.

## Example 2: Estimate Linear Model with Fixed Intercept Using offset() & rep() Functions

This example shows a second alternative to the syntax of the previous example.

In this example we’ll use the offset and rep functions to estimate our linear model with known intercept:

```
mod_intercept_2 <- lm(y ~ x + 0 +                  # Model with fixed intercept
                        offset(rep(intercept, 1000)))
```

The coefficient estimates in the following summary statistics are exactly the same as in Example 1, even though we have used a different R syntax. Note, however, that the R-squared and F-statistic differ, since the offset specification implies a different null model for these metrics:

```
summary(mod_intercept_2)                           # Summary statistics
# Call:
# lm(formula = y ~ x + 0 + offset(rep(intercept, 1000)))
#
# Residuals:
#     Min      1Q  Median      3Q     Max
# -3.0314 -0.7734 -0.0577  0.6222  3.1767
#
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)
# x  0.69743    0.01033   67.49   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 1.032 on 999 degrees of freedom
# Multiple R-squared:  0.9613,	Adjusted R-squared:  0.9612
# F-statistic: 2.479e+04 on 1 and 999 DF,  p-value: < 2.2e-16
```
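Even though the R-squared and F-statistic differ between the two outputs, both specifications minimize the same residual sum of squares and therefore return the same slope. A quick sketch to confirm this, assuming the two model objects from above:

```r
coef(mod_intercept_1)                              # Slope from Example 1
coef(mod_intercept_2)                              # Slope from Example 2
all.equal(unname(coef(mod_intercept_1)),           # Coefficients are identical
          unname(coef(mod_intercept_2)))
# [1] TRUE
```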

## Video, Further Resources & Summary

In case you need further explanations on the content of this article, you may have a look at the following video on my YouTube channel. I’m illustrating the R codes of this article in the video:


In summary: At this point you should know how to define a known constant in a linear regression model in R programming. Let me know in the comments section, if you have further questions or comments on regression models, constants, or any other related topics.


• Andre ENDRESS
September 8, 2021 9:42 am

Hello,
there is a difference in the R-squared and F-statistic between the two methods, but why?
BR Andre

• September 8, 2021 12:15 pm

Hey Andre,

Indeed, that’s an interesting question! My guess would be that the fixed intercept has an impact on these two metrics, but to be honest, I’m not 100% sure about this.

Please let me know in case you find a good explanation, I’m also curious now 🙂

Regards

Joachim

• ti
January 6, 2022 5:01 pm

It is because of the different null models for the F-test. 🙂

• January 7, 2022 9:55 am

Ah ok, thanks for sharing! 🙂
