summary Function in R (3 Examples)

 

This article shows how to compute descriptive statistics using the summary function in the R programming language.


Definition & Basic R Syntax of summary Function

 

Definition: The summary R function computes summary statistics of data and model objects.

 

Basic R Syntax: Please find the basic R programming syntax of the summary function below.

summary(data)                   # Basic R syntax of summary function
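Note that summary is a generic function, so the statistics it returns depend on the class of the input. The following is a minimal sketch of this behavior; the objects x_num and x_fac are made up for illustration and are not part of the examples in this tutorial:

```r
x_num <- c(2.5, 4, 7.5)                  # Hypothetical numeric input
x_fac <- factor(c("a", "b", "a"))        # Hypothetical factor input

summary(x_num)                           # Numeric input: min, quartiles, mean, max
summary(x_fac)                           # Factor input: counts per level

class(summary(x_num))                    # Class of the returned summary object
```

For a numeric vector the result is a small table of descriptive statistics, whereas for a factor it is a count for each level.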

 

In the following, I’ll show three examples for the application of the summary function in R.

Example 1: Applying summary Function to Vector

Example 1 illustrates how to apply the summary function to a numeric vector. First, we have to create a numeric vector in R:

vec <- 1:10                     # Create example vector
vec                             # Print example vector
# 1  2  3  4  5  6  7  8  9 10

As you can see based on the previous RStudio console output, our example vector ranges from 1 to 10.

Now, we can use the summary command to calculate summary statistics of our vector:

summary(vec)                    # Apply summary function to vector
# Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
# 1.00    3.25    5.50    5.50    7.75   10.00

The summary function returned six descriptive statistics of our input data: the minimum, the first quartile, the median, the mean, the third quartile, and the maximum.
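The returned object can be stored and indexed by name, which is handy if only a single statistic is needed. The sketch below assumes the same vector as above; vec_summary and vec_na are names introduced here for illustration only:

```r
vec <- 1:10                              # Same example vector as above
vec_summary <- summary(vec)              # Store summary in an object

vec_summary["Median"]                    # Extract single statistic by name
quantile(vec, probs = c(0.25, 0.75))     # Quartiles computed directly

vec_na <- c(vec, NA)                     # Hypothetical vector with missing value
summary(vec_na)                          # Output gains an additional NA's entry
```

If the input contains missing values, summary ignores them for the statistics and reports their count in a separate NA's entry.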

 

Example 2: Applying summary Function to Data Frame

We can also apply the summary function to other objects. The following R programming syntax shows how to compute descriptive statistics of a data frame. First, we have to construct a data frame in R:

data <- data.frame(x1 = 1:5,    # Create example data frame
                   x2 = letters[1:5],
                   x3 = 3,
                   stringsAsFactors = TRUE)  # Store x2 as factor (default before R 4.0.0)
data                            # Print example data frame
#   x1 x2 x3
# 1  1  a  3
# 2  2  b  3
# 3  3  c  3
# 4  4  d  3
# 5  5  e  3

Our data frame contains five rows and three columns. We can now use the summary function to return summary statistics for each of the variables of this data frame to the RStudio console:

summary(data)                   # Apply summary function to data frame
#        x1    x2          x3   
# Min.   :1   a:1   Min.   :3  
# 1st Qu.:2   b:1   1st Qu.:3  
# Median :3   c:1   Median :3  
# Mean   :3   d:1   Mean   :3  
# 3rd Qu.:4   e:1   3rd Qu.:3  
# Max.   :5         Max.   :3
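The level counts shown for x2 appear when that column is stored as a factor. As a hedged sketch, the code below uses a character column instead (the default of data.frame since R 4.0.0) and also shows how to summarize a single column or only the numeric columns. The names data_chr and num_cols are hypothetical:

```r
data_chr <- data.frame(x1 = 1:5,         # Hypothetical data frame with
                       x2 = letters[1:5],  # character column x2
                       stringsAsFactors = FALSE)

summary(data_chr$x1)                     # Summary of a single numeric column
summary(data_chr$x2)                     # Summary of a character column

num_cols <- sapply(data_chr, is.numeric) # Identify numeric columns
summary(data_chr[ , num_cols, drop = FALSE])  # Summarize numeric columns only
```

A character column is typically described by its length, class, and mode instead of counts, so converting it with factor() first is one way to get a frequency table.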

 

Example 3: Applying summary Function to Linear Regression Model

A very common application of the summary function is the computation of summary statistics of statistical models. In this example, I’ll explain how to create summary statistics of a linear regression model. Again, we have to create some example data first:

set.seed(93274)                 # Create random data
my_x <- rnorm(1000)
my_y <- rnorm(1000) + my_x

Our example data consists of two numeric vectors: my_x contains 1000 draws from a standard normal distribution, and my_y equals my_x plus standard normal noise.

We can estimate a linear regression model using the lm function as shown below:

mod <- lm(my_y ~ my_x)          # Estimate linear regression model

The data object mod contains the output of our linear regression.
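Before summarizing the model, it can help to look at the fitted object itself. This is a minimal sketch that recreates the example data so it runs on its own; it only uses base R accessor functions:

```r
set.seed(93274)                          # Recreate example data from above
my_x <- rnorm(1000)
my_y <- rnorm(1000) + my_x
mod <- lm(my_y ~ my_x)                   # Estimate linear regression model

class(mod)                               # Class of the model object
names(mod)                               # Components stored in the model object
coef(mod)                                # Estimated intercept and slope
```

The model object stores the estimates, but not the standard errors or p-values, which is why the summary function is usually applied next.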

Now, we can apply the summary function to this model object to print summary statistics for this model:

summary(mod)                    # Apply summary function to model
# Call:
# lm(formula = my_y ~ my_x)
# 
# Residuals:
#     Min      1Q  Median      3Q     Max 
# -3.7337 -0.6964 -0.0047  0.7333  3.3489 
# 
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)    
# (Intercept) -0.02159    0.03292  -0.656    0.512    
# my_x         1.00156    0.03262  30.707   <2e-16 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 
# Residual standard error: 1.041 on 998 degrees of freedom
# Multiple R-squared:  0.4858,	Adjusted R-squared:  0.4853 
# F-statistic: 942.9 on 1 and 998 DF,  p-value: < 2.2e-16

The previous output shows descriptive statistics of the residuals as well as the estimated intercept and regression coefficient, their standard errors, t-values, p-values, and significance levels, the R-squared, and the F-statistic.
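The printed values can also be extracted from the summary object for further calculations. The following sketch recreates the model so that it runs on its own; mod_summary is a name introduced here for illustration:

```r
set.seed(93274)                          # Recreate example data from above
my_x <- rnorm(1000)
my_y <- rnorm(1000) + my_x
mod <- lm(my_y ~ my_x)

mod_summary <- summary(mod)              # Store model summary in an object

mod_summary$coefficients                 # Matrix of estimates, SEs, t- and p-values
mod_summary$coefficients["my_x", "Estimate"]  # Slope estimate only
mod_summary$r.squared                    # Multiple R-squared
mod_summary$adj.r.squared                # Adjusted R-squared
mod_summary$sigma                        # Residual standard error
mod_summary$fstatistic                   # F-statistic with degrees of freedom
```

Working with these components avoids copying numbers from the console by hand, for example when building a results table.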

 

Video, Further Resources & Summary

If you need more explanations on the R code of this tutorial, I recommend watching the following video on my YouTube channel. In the video, I explain the content of this article.

 

The YouTube video will be added soon.

 

Furthermore, you might want to read the other articles that I have published on this website.

 

In this article, you learned how to calculate object summaries in the R programming language. Don’t hesitate to let me know in the comments section if you have additional questions.

 


