Standardize Data Frame Columns in R (2 Examples) | scale Function

 

In this tutorial, I’ll illustrate how to standardize (i.e. center and scale) the columns of a data frame in R programming.

Let’s dig in:

 

Example Data

First, we have to create some example data:

set.seed(936757)                  # Create example data
data <- data.frame(x1 = runif(100),
                   x2 = runif(100))
head(data)                        # Head of example data
#          x1        x2
# 1 0.1455930 0.4151339
# 2 0.1524390 0.8134210
# 3 0.2395215 0.7935963
# 4 0.7515754 0.5283296
# 5 0.7779833 0.4289751
# 6 0.6350550 0.8990219

Have a look at the previous output of the RStudio console. It shows the first six rows of our example data, which consists of 100 rows and two numeric columns x1 and x2.

 

Example 1: Scaling Data Frame Using scale() Function

The following R syntax shows how to standardize our example data using the scale function in R. By default, scale centers each column at its mean and divides it by its standard deviation.

As you can see in the following R code, we simply have to insert the name of our data frame (i.e. data) into the scale function:

data_scale1 <- scale(data)        # Apply scale function
head(data_scale1)                 # Head of scaled data
#              x1           x2
# [1,] -1.2764344 -0.433002745
# [2,] -1.2523528  1.058137478
# [3,] -0.9460274  0.983916110
# [4,]  0.8551967 -0.009211208
# [5,]  0.9480902 -0.381182671
# [6,]  0.4453192  1.378617326

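The previous output of the RStudio console shows the first six rows of our standardized data. Note that the scale function returns a matrix rather than a data frame, which is why the rows are labeled [1,], [2,], and so on.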
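To make the standardization step more tangible, here is a minimal sketch that reproduces one column by hand and inspects what scale stores alongside the result. It assumes the default arguments of scale (center = TRUE, scale = TRUE); the object names x1_manual and data_scale1_df are made up for this illustration and are not part of the original example.

```r
set.seed(936757)                                # Same seed as the example data
data <- data.frame(x1 = runif(100),
                   x2 = runif(100))

data_scale1 <- scale(data)                      # Matrix of standardized values

# Manual z-score for x1: subtract the mean, divide by the standard deviation
x1_manual <- (data$x1 - mean(data$x1)) / sd(data$x1)
all.equal(as.vector(data_scale1[, "x1"]), x1_manual)   # TRUE

attr(data_scale1, "scaled:center")              # Column means used for centering
attr(data_scale1, "scaled:scale")               # Column standard deviations used for scaling

data_scale1_df <- as.data.frame(data_scale1)    # Convert the matrix back to a data frame
```

The last line is one possible way to get a data frame back in case later code expects one.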

 

Example 2: Scaling Data Frame Using dplyr Package

As an alternative to applying the scale function to the entire data frame, we can combine it with functions of the dplyr add-on package, which keeps the result as a data frame. First, we need to install and load the dplyr package:

install.packages("dplyr")         # Install & load dplyr package
library("dplyr")

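Now, we can standardize our data frame using the dplyr package as shown below. The as.vector call converts the one-column matrix returned by scale back to a plain numeric vector: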

data_scale2 <- data %>%           # Applying functions of dplyr
  mutate_at(c("x1", "x2"), ~(scale(.) %>% as.vector))
head(data_scale2)                 # Head of scaled data
#           x1           x2
# 1 -1.2764344 -0.433002745
# 2 -1.2523528  1.058137478
# 3 -0.9460274  0.983916110
# 4  0.8551967 -0.009211208
# 5  0.9480902 -0.381182671
# 6  0.4453192  1.378617326

As you can see, the values are exactly the same as in Example 1. However, this time the result is a data frame instead of a matrix.
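In dplyr 1.0.0 and later, the scoped verbs such as mutate_at are superseded by across. The following sketch shows how the same standardization might be written in that style, assuming a dplyr version of at least 1.0.0 is installed. The object name data_scale3 is hypothetical and only used here.

```r
library("dplyr")                                # Assumes dplyr >= 1.0.0 is installed

set.seed(936757)                                # Same seed as the example data
data <- data.frame(x1 = runif(100),
                   x2 = runif(100))

data_scale3 <- data %>%                         # Standardize the selected columns
  mutate(across(c(x1, x2), ~ as.vector(scale(.x))))

# Compare the values with the matrix returned by scale()
all.equal(as.matrix(data_scale3), scale(data), check.attributes = FALSE)   # TRUE
```

The comparison ignores attributes because scale attaches the centering and scaling values to its result, while the data frame version does not carry them.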

 

This article explained how to standardize data frame variables in the R programming language. Please let me know in the comments section if you have any further questions.

 
