Estimate Linear Model with Fixed Intercept in R (2 Examples)

In this post you’ll learn how to set a fixed intercept when estimating a linear regression model in the R programming language.

The post is structured as follows:

1) Creation of Example Data

2) Example 1: Estimate Linear Model with Fixed Intercept Using I() Function

3) Example 2: Estimate Linear Model with Fixed Intercept Using offset() & rep() Functions

4) Video, Further Resources & Summary


It’s time to dive into the programming part.

Creation of Example Data

The data below will be used as the basis for this R tutorial:

set.seed(653897)                                   # Create example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x

Our example data consists of two normally distributed numeric vectors that are correlated with each other.
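
As a quick check, the following sketch recreates the data and inspects it. Because y is built as x plus normally distributed noise with mean 2, the data-generating process has a true intercept of 2 and a true slope of 1. The data frame name dat is only a helper for this illustration and is not used elsewhere in the tutorial.

```r
set.seed(653897)                                   # Same seed as above
x <- rnorm(1000, 3)                                # Mean 3, standard deviation 1
y <- rnorm(1000, 2) + x                            # Noise with mean 2 plus x

dat <- data.frame(x, y)                            # Hypothetical helper object
head(dat)                                          # First six rows
colMeans(dat)                                      # Sample means of x and y
cor(x, y)                                          # Correlation between x and y
```

With this setup, a freely estimated intercept should land near 2, which is worth keeping in mind when a different value is fixed later on.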

Let’s estimate a linear regression model without specifying the intercept manually (i.e. the default specification of the lm function):

mod_default <- lm(y ~ x)                           # Estimate linear model
summary(mod_default)                               # Summary statistics
# Call:
# lm(formula = y ~ x)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -3.3152 -0.6598  0.0209  0.6563  3.4294 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept)  2.05729    0.09966   20.64   <2e-16 ***
# x            0.98086    0.03156   31.08   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 0.9891 on 998 degrees of freedom
# Multiple R-squared:  0.4919,	Adjusted R-squared:  0.4914 
# F-statistic: 966.1 on 1 and 998 DF,  p-value: < 2.2e-16

The previous output of the RStudio console shows the descriptive summary statistics of our linear regression model. As you can see, we have estimated an intercept of 2.05729 and a regression coefficient for x of 0.98086.

Let’s estimate another model with fixed intercept…

Example 1: Estimate Linear Model with Fixed Intercept Using I() Function

Example 1 illustrates how to estimate a linear model with a known intercept.

For this, we first have to specify our fixed intercept:

intercept <- 3                                     # Define fixed intercept

Next, we can estimate our linear model using the I() function as shown below:

mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)    # Model with fixed intercept

Finally, we can apply the summary function to return our descriptive statistics:

summary(mod_intercept_1)                           # Summary statistics
# Call:
# lm(formula = I(y - intercept) ~ 0 + x)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -3.0314 -0.7734 -0.0577  0.6222  3.1767 
# 
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)    
# x  0.69743    0.01033   67.49   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 1.032 on 999 degrees of freedom
# Multiple R-squared:  0.8201,	Adjusted R-squared:   0.82 
# F-statistic:  4555 on 1 and 999 DF,  p-value: < 2.2e-16

As you can see, the previously estimated model did not return an intercept value, since we have manually specified this intercept beforehand.

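You can also see that the x estimate has changed from 0.98086 to 0.69743. The regression line is now forced through the point (0, 3) instead of the freely estimated intercept of about 2.06, so the slope has to become flatter to still pass through the bulk of the data.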
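
The slope of a model with fixed intercept can also be computed by hand, which may help to see what lm() is doing here. The sketch below uses the standard least squares formula for a regression through the origin, applied to the shifted response y - intercept. The object name slope_manual is only used for this illustration.

```r
set.seed(653897)                                   # Recreate example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3                                     # Fixed intercept

mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)    # Model from Example 1

slope_manual <- sum(x * (y - intercept)) / sum(x^2)  # Closed-form slope
slope_manual                                       # Manual estimate
unname(coef(mod_intercept_1))                      # Estimate returned by lm()
all.equal(slope_manual, unname(coef(mod_intercept_1)))  # Compare both
```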

Important notes on models with fixed intercept:

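The summary output of models with fixed intercept has to be interpreted carefully. Metrics such as the R-squared, the t-value, and the F-statistic are much larger than in the model without fixed intercept. For the R-squared and the F-statistic, this is because R compares a model without an intercept term against a baseline that predicts zero rather than the mean of the response, so these values are not comparable between the two models.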
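
To make the effect of the baseline visible, the sketch below computes the R-squared of the model from Example 1 in two ways: once against a baseline that predicts zero (uncentered) and once against a baseline that predicts the mean of the shifted response (centered). The names y_shift, rss, r2_uncentered, and r2_centered are helpers introduced only for this illustration.

```r
set.seed(653897)                                   # Recreate example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3                                     # Fixed intercept

mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)    # Model from Example 1

y_shift <- y - intercept                           # Response that is modelled
rss <- sum(residuals(mod_intercept_1)^2)           # Residual sum of squares

r2_uncentered <- 1 - rss / sum(y_shift^2)          # Baseline predicts zero
r2_centered <- 1 - rss /                           # Baseline predicts the mean
  sum((y_shift - mean(y_shift))^2)

r2_uncentered                                      # Uncentered R-squared
summary(mod_intercept_1)$r.squared                 # Value reported by summary()
r2_centered                                        # Centered R-squared
```

The uncentered value can never be smaller than the centered one, because the sum of squares around zero is at least as large as the sum of squares around the mean.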

Furthermore, it is often not advisable to specify a fixed intercept from a theoretical & methodological viewpoint. You may find a detailed discussion on this topic in this thread on Cross Validated.

Example 2: Estimate Linear Model with Fixed Intercept Using offset() & rep() Functions

This example shows an alternative syntax to the one used in the previous example.

In this example we’ll use the offset and rep functions to estimate our linear model with known intercept:

mod_intercept_2 <- lm(y ~ x + 0 +                  # Model with fixed intercept
                        offset(rep(intercept, 1000)))
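
The value 1000 in the code above is tied to the size of our example data. As a hedged variation, the sketch below derives the length from the data instead and also shows the offset argument of lm(), which is an alternative way to pass the same vector. The model names mod_offset_a, mod_offset_b, and mod_offset_c are hypothetical and only used here.

```r
set.seed(653897)                                   # Recreate example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3                                     # Fixed intercept

mod_offset_a <- lm(y ~ x + 0 +                     # Hard-coded length
                     offset(rep(intercept, 1000)))
mod_offset_b <- lm(y ~ x + 0 +                     # Length taken from the data
                     offset(rep(intercept, length(y))))
mod_offset_c <- lm(y ~ x + 0,                      # offset argument of lm()
                   offset = rep(intercept, length(y)))

coef(mod_offset_a)                                 # All three slopes agree
coef(mod_offset_b)
coef(mod_offset_c)
```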

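The following summary statistics show the same residuals, coefficient estimate, standard error, and residual standard error as in Example 1, even though we have used a different R syntax. Note, however, that the R-squared and the F-statistic differ from Example 1: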

summary(mod_intercept_2)                           # Summary statistics
# Call:
# lm(formula = y ~ x + 0 + offset(rep(intercept, 1000)))
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -3.0314 -0.7734 -0.0577  0.6222  3.1767 
# 
# Coefficients:
#   Estimate Std. Error t value Pr(>|t|)    
# x  0.69743    0.01033   67.49   <2e-16 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
# Residual standard error: 1.032 on 999 degrees of freedom
# Multiple R-squared:  0.9613,	Adjusted R-squared:  0.9612 
# F-statistic: 2.479e+04 on 1 and 999 DF,  p-value: < 2.2e-16
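
The two models can be compared directly. The sketch below checks that slope and residuals agree, and then computes an uncentered R-squared from the fitted values of each model. The fitted values of the offset model include the constant 3, while those of the I() model do not. The printed R-squared values above (0.8201 and 0.9613) are consistent with this calculation, but this is an interpretation of the output, and the way summary() treats offsets may differ between R versions. The names rss, mss_without, and mss_with are helpers for this illustration.

```r
set.seed(653897)                                   # Recreate example data
x <- rnorm(1000, 3)
y <- rnorm(1000, 2) + x
intercept <- 3                                     # Fixed intercept

mod_intercept_1 <- lm(I(y - intercept) ~ 0 + x)    # Model from Example 1
mod_intercept_2 <- lm(y ~ x + 0 +                  # Model from Example 2
                        offset(rep(intercept, 1000)))

all.equal(coef(mod_intercept_1),                   # Same slope
          coef(mod_intercept_2))
all.equal(residuals(mod_intercept_1),              # Same residuals
          residuals(mod_intercept_2))

rss <- sum(residuals(mod_intercept_2)^2)           # Residual sum of squares
mss_without <- sum(fitted(mod_intercept_1)^2)      # Fitted values: slope * x
mss_with <- sum(fitted(mod_intercept_2)^2)         # Fitted values: 3 + slope * x

mss_without / (mss_without + rss)                  # R-squared without offset
mss_with / (mss_with + rss)                        # R-squared with offset
```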

Video, Further Resources & Summary

In case you need further explanations on the content of this article, you may have a look at the video on my YouTube channel, in which I illustrate the R code of this article.

In addition, you could read the other tutorials on my website.

In summary: At this point you should know how to define a known constant in a linear regression model in R programming. Let me know in the comments section if you have further questions or comments on regression models, constants, or any other related topics.

5 Comments

Andre ENDRESS
September 8, 2021 9:42 am

Hello,
there is a difference in R-squared and F-statistic between the two methods, but why?
BR Andre

- Joachim
  September 8, 2021 12:15 pm
  
  Hey Andre,
  
  Indeed, that’s an interesting question! My guess would be that the fixed intercept has an impact on these two metrics, but to be honest, I’m not 100% sure about this.
  
  Please let me know in case you find a good explanation, I’m also curious now 🙂
  
  Regards
  
  Joachim
  
  - ti
    January 6, 2022 5:01 pm
    
    It is because of different null models for the F-test. 🙂
    
    - Joachim
      January 7, 2022 9:55 am
      
      Ah ok, thanks for sharing! 🙂
      
- Joachim (Statistics Globe)
  January 2, 2023 12:45 pm
  
  Since this discussion popped up once again in the comments of the YouTube video, I have done some further research and found this thread at Cross Validated. It explains this topic quite nicely.
  
  I hope this helps!
  
  Joachim
  