Fitting Polynomial Regression Model in R (3 Examples)


In this post, I’ll explain how to estimate a polynomial regression model in the R programming language.

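The article is structured as follows:

1) Creation of Example Data
2) Example 1: Manually Specify Polynomial Regression Model
3) Example 2: Applying poly() Function to Fit Polynomial Regression Model
4) Example 3: Applying poly() Function to Fit Polynomial Regression Model with Orthogonal Polynomials
5) Video, Further Resources & Summary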

Let’s dive right in!


Creation of Example Data

The following data will be used as the basis for this R tutorial:

set.seed(756328)                        # Create example data
x <- rnorm(100)
y <- rnorm(100) + x

Our example data consists of two numeric vectors, x and y, each with 100 values. We’ll use y as the target variable and x as the predictor variable.


Example 1: Manually Specify Polynomial Regression Model

This example illustrates how to perform a polynomial regression analysis by coding the polynomials manually.

For this, we can use the lm() and I() functions as shown below:

lm(y ~ x + I(x^2) + I(x^3) + I(x^4))    # Manually specify fourth order polynomial
# Call:
# lm(formula = y ~ x + I(x^2) + I(x^3) + I(x^4))
# Coefficients:
# (Intercept)            x       I(x^2)       I(x^3)       I(x^4)  
#     0.13584      1.24637     -0.27315     -0.04925      0.04200

As the previous output of the RStudio console shows, we have fitted a regression model with a fourth-order polynomial.
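You may wonder why the I() function is needed at all. Inside a model formula, the ^ symbol is treated as a formula operator rather than as arithmetic, so a term such as x^2 without I() collapses to x. The following sketch recreates the example data and compares both variants; the object names mod_no_I and mod_with_I are my own choices for this illustration:

```r
set.seed(756328)                        # Recreate example data
x <- rnorm(100)
y <- rnorm(100) + x

mod_no_I   <- lm(y ~ x + x^2)           # ^ is a formula operator here, so x^2 reduces to x
mod_with_I <- lm(y ~ x + I(x^2))        # I() forces arithmetic squaring of x

names(coef(mod_no_I))                   # Intercept and x only
names(coef(mod_with_I))                 # Intercept, x, and I(x^2)
```

The model without I() contains no quadratic term, which is why each power is wrapped in I() in Example 1.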


Example 2: Applying poly() Function to Fit Polynomial Regression Model

Depending on the order of your polynomial regression model, it might be tedious and error-prone to code each polynomial term manually (as shown in Example 1).

To make our code more compact, we can use the poly() function, which is part of the stats package that ships with the basic installation of the R programming language:

lm(y ~ poly(x, 4, raw = TRUE))          # Using poly function
# Call:
# lm(formula = y ~ poly(x, 4, raw = TRUE))
# Coefficients:
#             (Intercept)  poly(x, 4, raw = TRUE)1  poly(x, 4, raw = TRUE)2  poly(x, 4, raw = TRUE)3  poly(x, 4, raw = TRUE)4  
#                 0.13584                  1.24637                 -0.27315                 -0.04925                  0.04200

Note that we have set the raw argument within the poly function to TRUE. With this setting, poly() uses the plain powers of x (i.e. x, x^2, x^3, and x^4), which is why the coefficients are the same as in Example 1.
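If you want to confirm this equality in code rather than by comparing the printed output, a minimal check could look as follows. The object names mod_manual and mod_raw are assumptions made for this sketch, and unname() is used because the two models label their coefficients differently:

```r
set.seed(756328)                        # Recreate example data
x <- rnorm(100)
y <- rnorm(100) + x

mod_manual <- lm(y ~ x + I(x^2) + I(x^3) + I(x^4))   # Example 1
mod_raw    <- lm(y ~ poly(x, 4, raw = TRUE))         # Example 2

all.equal(unname(coef(mod_manual)),     # Compare coefficients, ignoring their names
          unname(coef(mod_raw)))
```

all.equal() compares the values up to a small numerical tolerance, which is usually preferable to an exact comparison of floating-point numbers.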

Does this make sense? I’ll explain in the next example…


Example 3: Applying poly() Function to Fit Polynomial Regression Model with Orthogonal Polynomials

Both the manual coding (Example 1) and the application of the poly function with raw = TRUE (Example 2) use raw polynomials.

However, depending on your situation you might prefer to use orthogonal (i.e. uncorrelated) polynomials.

For this, we simply have to remove the raw argument from our R syntax, because the poly function sets raw = FALSE by default:

lm(y ~ poly(x, 4))                      # Use orthogonal polynomials
# Call:
# lm(formula = y ~ poly(x, 4))
# Coefficients:
# (Intercept)  poly(x, 4)1  poly(x, 4)2  poly(x, 4)3  poly(x, 4)4  
#    -0.03016     11.67261     -0.26362     -1.45849      1.57512

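As you can see, the coefficients of this polynomial regression model differ from those in Examples 1 and 2, because this time we used orthogonal polynomials. The fitted values are the same in all three examples; only the parameterization of the polynomial terms has changed.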
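The following sketch illustrates both points on our example data: the raw and the orthogonal model produce the same fitted values, and the columns created by poly() are uncorrelated, whereas the raw powers are not. The object names mod_raw and mod_orth are my own choices for this illustration:

```r
set.seed(756328)                        # Recreate example data
x <- rnorm(100)
y <- rnorm(100) + x

mod_raw  <- lm(y ~ poly(x, 4, raw = TRUE))   # Raw polynomials
mod_orth <- lm(y ~ poly(x, 4))               # Orthogonal polynomials

all.equal(fitted(mod_raw), fitted(mod_orth)) # Both models describe the same curve

round(cor(poly(x, 4, raw = TRUE)), 2)        # Raw powers are correlated with each other
round(cor(poly(x, 4)), 2)                    # Orthogonal columns: off-diagonal values are 0
```

Because the orthogonal columns are uncorrelated, the estimate for a lower-order term does not change when a higher-order term is added, which can make it easier to judge the contribution of each degree.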


Video, Further Resources & Summary

In summary: At this point you should have learned how to fit polynomial regression models in the R programming language, using either manually coded terms or the poly function with raw or orthogonal polynomials. Let me know in the comments section if you have additional questions or comments.

