Extract Regression Coefficients of Linear Model in R (Example)

 

This tutorial illustrates how to return the regression coefficients of a linear model estimation in R programming.

So without further ado, let’s get started:

 

Constructing Example Data

We use the following data as the basis for this tutorial:

set.seed(87634)                                       # Create random example data
x1 <- rnorm(1000)
x2 <- rnorm(1000) + 0.3 * x1
x3 <- rnorm(1000) + 0.1 * x1 + 0.2 * x2
x4 <- rnorm(1000) + 0.2 * x1 - 0.3 * x3
x5 <- rnorm(1000) - 0.1 * x2 + 0.1 * x4
y <- rnorm(1000) + 0.1 * x1 - 0.2 * x2 + 0.1 * x3 + 0.1 * x4 - 0.2 * x5
data <- data.frame(y, x1, x2, x3, x4, x5)
head(data)                                            # Head of data
#            y          x1          x2         x3         x4          x5
# 1 -0.6441526 -0.42219074 -0.12603789 -0.6812755  0.9457604 -0.39240211
# 2 -0.9063134 -0.19953976 -0.35341624  1.0024131  1.3120547  0.05489608
# 3 -0.8873880  0.30450638 -0.58551780 -1.1073109 -0.2047048  0.44607502
# 4  0.4567184  1.33299913 -0.05512412 -0.5772521  0.3476488  1.65124595
# 5  0.6631039 -0.36705475 -0.26633088  1.0520141 -0.3281474  0.77052209
# 6  1.3952174  0.03528151 -2.43580550 -0.6727582  1.8374260  1.06429782

The RStudio console output above shows the structure of our example data: it's a data frame consisting of six numeric columns. The first variable, y, is the outcome variable. The remaining variables, x1 to x5, are the predictors.
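
If you prefer to confirm this structure programmatically rather than by eye, a quick check along the following lines may help. This is only a minimal sketch: it recreates the example data from above, and the object name all_numeric is just an illustrative choice.

```r
set.seed(87634)                                       # Recreate example data from above
x1 <- rnorm(1000)
x2 <- rnorm(1000) + 0.3 * x1
x3 <- rnorm(1000) + 0.1 * x1 + 0.2 * x2
x4 <- rnorm(1000) + 0.2 * x1 - 0.3 * x3
x5 <- rnorm(1000) - 0.1 * x2 + 0.1 * x4
y <- rnorm(1000) + 0.1 * x1 - 0.2 * x2 + 0.1 * x3 + 0.1 * x4 - 0.2 * x5
data <- data.frame(y, x1, x2, x3, x4, x5)

str(data)                                             # Compact overview of all columns
dim(data)                                             # Number of rows and columns
all_numeric <- all(sapply(data, is.numeric))          # TRUE if every column is numeric
all_numeric
```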

 

Example: Extracting Coefficients of Linear Model

In this example, I’ll illustrate how to estimate and save the regression coefficients of a linear model in R. First, we have to estimate our statistical model using the lm function and print its results using the summary function:

summary(lm(y ~ ., data))                              # Estimate model
# Call:
# lm(formula = y ~ ., data = data)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -2.9106 -0.6819 -0.0274  0.7197  3.8374 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept) -0.01158    0.03204  -0.362 0.717749    
# x1           0.10656    0.03413   3.122 0.001847 ** 
# x2          -0.17723    0.03370  -5.259 1.77e-07 ***
# x3           0.11174    0.03380   3.306 0.000982 ***
# x4           0.09933    0.03295   3.015 0.002638 ** 
# x5          -0.24871    0.03323  -7.485 1.57e-13 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Residual standard error: 1.011 on 994 degrees of freedom
# Multiple R-squared:  0.08674,	Adjusted R-squared:  0.08214 
# F-statistic: 18.88 on 5 and 994 DF,  p-value: < 2.2e-16

The previous output of the RStudio console shows all the estimates we need. However, the coefficient values are only printed to the console, not stored in an object we can work with. Let’s therefore extract the coefficients from the summary output of our model as a matrix:

matrix_coef <- summary(lm(y ~ ., data))$coefficients  # Extract coefficients in matrix
matrix_coef                                           # Return matrix of coefficients
#                Estimate Std. Error    t value     Pr(>|t|)
# (Intercept) -0.01158450 0.03203930 -0.3615716 7.177490e-01
# x1           0.10656343 0.03413045  3.1222395 1.846683e-03
# x2          -0.17723211 0.03369896 -5.2592753 1.770787e-07
# x3           0.11174223 0.03380415  3.3055772 9.817042e-04
# x4           0.09932518 0.03294739  3.0146597 2.637990e-03
# x5          -0.24870659 0.03322673 -7.4851370 1.572040e-13

The previous R code saved the coefficient estimates, standard errors, t-values, and p-values in a matrix with one row per model term.
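
The same `$` access also works for the other parts of the summary output, such as the R-squared values and the F-statistic. The following is a minimal sketch under the assumption that the example data is recreated as above; object names such as mod_summary and p_value_f are my own illustrative choices. The names() function lists which components the summary object contains.

```r
set.seed(87634)                                       # Recreate example data from above
x1 <- rnorm(1000)
x2 <- rnorm(1000) + 0.3 * x1
x3 <- rnorm(1000) + 0.1 * x1 + 0.2 * x2
x4 <- rnorm(1000) + 0.2 * x1 - 0.3 * x3
x5 <- rnorm(1000) - 0.1 * x2 + 0.1 * x4
y <- rnorm(1000) + 0.1 * x1 - 0.2 * x2 + 0.1 * x3 + 0.1 * x4 - 0.2 * x5
data <- data.frame(y, x1, x2, x3, x4, x5)

mod_summary <- summary(lm(y ~ ., data))               # Store summary in an object
names(mod_summary)                                    # Components of the summary object

r_squared <- mod_summary$r.squared                    # Multiple R-squared
adj_r_squared <- mod_summary$adj.r.squared            # Adjusted R-squared
sigma_hat <- mod_summary$sigma                        # Residual standard error
f_stat <- mod_summary$fstatistic                      # Named vector: value, numdf, dendf

p_value_f <- unname(pf(f_stat["value"],               # p-value of the F-test
                       f_stat["numdf"],
                       f_stat["dendf"],
                       lower.tail = FALSE))
```

Running str(mod_summary) gives a more detailed view of how each component is stored.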

Now we can apply any matrix manipulation we want to our matrix of coefficients. For instance, we may extract only the coefficient estimates by subsetting our matrix:

my_estimates <- matrix_coef[ , 1]                     # Matrix manipulation to extract estimates
my_estimates                                          # Print estimates
# (Intercept)          x1          x2          x3          x4          x5 
# -0.01158450  0.10656343 -0.17723211  0.11174223  0.09932518 -0.24870659

That’s it. Now you can do whatever you want with your regression output!
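
As one possible use of the extracted estimates, the following sketch computes fitted values and a prediction by hand and compares them with the results of the fitted and predict functions. It assumes the example data from above; the object names mod, X, fitted_manual, and pred_manual, as well as the values in new_data, are hypothetical and only serve as illustration.

```r
set.seed(87634)                                       # Recreate example data from above
x1 <- rnorm(1000)
x2 <- rnorm(1000) + 0.3 * x1
x3 <- rnorm(1000) + 0.1 * x1 + 0.2 * x2
x4 <- rnorm(1000) + 0.2 * x1 - 0.3 * x3
x5 <- rnorm(1000) - 0.1 * x2 + 0.1 * x4
y <- rnorm(1000) + 0.1 * x1 - 0.2 * x2 + 0.1 * x3 + 0.1 * x4 - 0.2 * x5
data <- data.frame(y, x1, x2, x3, x4, x5)

mod <- lm(y ~ ., data)                                # Fit model once and store it
my_estimates <- summary(mod)$coefficients[ , 1]       # Coefficient estimates

X <- model.matrix(mod)                                # Design matrix incl. intercept column
fitted_manual <- as.vector(X %*% my_estimates)        # Intercept + sum of estimate * predictor
all.equal(fitted_manual, unname(fitted(mod)))         # Compare with fitted()

new_data <- data.frame(x1 = 0.5, x2 = -1, x3 = 0,     # Hypothetical new observation
                       x4 = 1, x5 = 2)
pred_manual <- sum(c(1, unlist(new_data)) * my_estimates)  # 1 is for the intercept
all.equal(pred_manual, unname(predict(mod, newdata = new_data)))  # Compare with predict()
```

In practice, predict is usually the more convenient choice. The manual version is shown only to make clear what the coefficients do.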

 

Video & Further Resources

I have recently released a video on my YouTube channel that shows the R code of this tutorial.

 

 

Besides the video, you might want to have a look at the related articles on this website.

 

This tutorial explained how to extract the coefficient estimates of a statistical model in R. Please let me know in the comments section if you have additional questions.

 


12 Comments

  • Hi,
    How do I do the same with a simple linear regression output to compare both tables?

    Thank you

    • Hi Brent,

      This depends on the values/estimates you want to compare. You may apply the code shown in this tutorial to as many different linear regression models as you want.

      Regards

      Joachim

  • Hi Joachim,
    I want to use those intercepts in an expression, how do I do that?

  • Hi Joachim,

    How do you extract the information at the bottom of the summary() output:

    Residual standard error: 1.011 on 994 degrees of freedom
    Multiple R-squared: 0.08674
    Adjusted R-squared: 0.08214
    F-statistic: 18.88 on 5 and 994 DF, p-value: < 2.2e-16
    ?

    Is there a way to view the structure of functions like summary() to ascertain how the numbers are stored in them, in general?

    Thanks,
    Eric

  • Hey Eric,
    Do you know how I can use the estimated regression coefficients to predict another dataset?

    • Hey Christian,

      I don’t know who Eric is 🙂 but I would need to get some more information on what you would like to do.

      Could you please provide some example data and explain your problem in some more detail?

      Regards,
      Joachim

  • Hi Joachim,

    I am running a multilevel model for the meta-analysis using the following code:
    output5_MR = map(metrics4,
    function(i) metadata1 %>%
    dplyr::filter(measurement_n==i) %>%
    rma.mv(lnrr, v, random = ~ 1 | publication_title / unique_id, mods = ~ duration_exp +
    feedstock_rename + temp_group + Biochar_app_rate + fertilizer_app_rate +
    manure_app_rate + continent + soil_texture + soil_sample_depth_min
    + soil_sample_depth_max + country + annual_temp,
    method = "REML", data=.))

    where “metrics4” is the dependent variable I am interested in (there are 8 dependent variables). So when I run this model I get results for all these effect modifiers. I would like to extract the estimates and standard errors of a specific effect modifier, for example “Biochar_app_rate”. Do you have any idea how to do it? I tried the following code:
    output5_MR_b <- map_dbl(output5_MR$Biochar_app_rate,
    function(x) as.numeric(x[["b"]]) )

    output5_MR_se <- map_dbl(output5_MR$Biochar_app_rate,
    function(x) as.numeric(x[["se"]]) )

    But these codes will help extract b and se of all the effect modifiers listed in the model. I am only interested in one of them. If you know how to do it, could you please share it with me? Thanks in advance

    • Hey Madina,

      Please excuse the delayed response. I was on a long vacation, so unfortunately I wasn’t able to get back to you sooner. Still need help with your code?

      Regards,
      Joachim

  • Hi Joachim,

    I ran a meta-regression analysis in `metafor` package using the following code

    ```
    output5_MR = map(metrics4,
    #magrittr::extract(!. %in% c("Soil NPK availability", "Nutrient use efficiency")),
    function(i) metadata1 %>%
    dplyr::filter(measurement_n==i) %>%
    rma.mv(lnrr, v, random = ~ 1 | publication_title / unique_id, mods = ~ duration_exp +
    feedstock_rename + temp_group + Biochar_app_rate + fertilizer_app_rate +
    manure_app_rate + continent + soil_texture + soil_sample_depth_min
    + soil_sample_depth_max + country + annual_temp,
    method = "REML", data=.))
    ```

    `metrics4` contains 9 dependent variables. When I ran the model I get the results for all effect modifiers (which are given with the `mods` function) together with intercept. I would like to extract the estimate `b` value and standard error `se` of a certain effect modifier, let’s say Biochar_app_rate. I tried the following code

    ```
    output5_MR_b <- map_dbl(output5_MR,
    function(x) as.numeric(x[["b"]]) )

    output5_MR_se <- map_dbl(output5_MR$Biochar_app_rate,
    function(x) as.numeric(x[["se"]]) )
    ```

    But these codes help me extract `se` and `b` for all the effect modifiers. Do you know how to create a code for extracting `b` and `se` of certain effect modifiers?

    Thanks in advance

    • Hey Madina,

      Please excuse the delayed response. I was on a long vacation, so unfortunately I wasn’t able to get back to you earlier. Do you still need help with your code?

      Regards,
      Joachim
