Standardize Data Frame Columns in R (2 Examples) | scale Function

 

In this tutorial, I’ll illustrate how to scale (i.e. standardize) the columns of a data frame in the R programming language.

Let’s dig in:

 

Example Data

First, we have to create some example data:

set.seed(936757)                  # Create example data
data <- data.frame(x1 = runif(100),
                   x2 = runif(100))
head(data)                        # Head of example data
#          x1        x2
# 1 0.1455930 0.4151339
# 2 0.1524390 0.8134210
# 3 0.2395215 0.7935963
# 4 0.7515754 0.5283296
# 5 0.7779833 0.4289751
# 6 0.6350550 0.8990219

The previous output of the RStudio console shows that our example data consists of two numeric columns, x1 and x2. Each column contains 100 uniformly distributed random values.

 

Example 1: Scaling Data Frame Using scale() Function

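The following R syntax shows how to standardize our example data using the scale function. With its default settings, scale centers each column at a mean of zero and rescales it to a standard deviation of one.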

As you can see in the following R code, we simply have to insert the name of our data frame (i.e. data) into the scale function:

data_scale1 <- scale(data)        # Apply scale function
head(data_scale1)                 # Head of scaled data
#              x1           x2
# [1,] -1.2764344 -0.433002745
# [2,] -1.2523528  1.058137478
# [3,] -0.9460274  0.983916110
# [4,]  0.8551967 -0.009211208
# [5,]  0.9480902 -0.381182671
# [6,]  0.4453192  1.378617326

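The previous output of the RStudio console shows the first six rows of our standardized data. Note that the scale function returns a matrix rather than a data frame, which is why the row labels are printed as [1,], [2,], and so on.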
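To make clearer what happens inside scale, the following sketch recomputes the first column by hand and compares it with the result of Example 1. It assumes the default arguments of scale (center = TRUE, scale = TRUE); the object names x1_manual and data_scale1_df are only used here for illustration and are not part of the original example.

```r
set.seed(936757)                                  # Same seed as in the example data
data <- data.frame(x1 = runif(100),
                   x2 = runif(100))

data_scale1 <- scale(data)                        # Apply scale function as in Example 1

x1_manual <- (data$x1 - mean(data$x1)) /          # Standardize x1 by hand:
  sd(data$x1)                                     # subtract mean, divide by sd
all.equal(as.vector(data_scale1[ , "x1"]),        # Compare manual result with scale()
          x1_manual)

round(colMeans(data_scale1), 10)                  # Column means after scaling
apply(data_scale1, 2, sd)                         # Column standard deviations after scaling

class(data_scale1)                                # scale() returns a matrix
data_scale1_df <- as.data.frame(data_scale1)      # Convert back to a data frame if needed
```

The conversion in the last line is useful whenever later code expects a data frame, for example when columns are accessed with the $ operator.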

 

Example 2: Scaling Data Frame Using dplyr Package

As an alternative to applying the scale function to the entire data frame, we can combine it with the functions of the dplyr add-on package to standardize selected columns. First, we need to install and load the dplyr package:

install.packages("dplyr")         # Install & load dplyr package
library("dplyr")

Now, we can standardize our data frame using the dplyr package as shown below:

data_scale2 <- data %>%           # Applying functions of dplyr
  mutate_at(c("x1", "x2"), ~(scale(.) %>% as.vector))
head(data_scale2)                 # Head of scaled data
#           x1           x2
# 1 -1.2764344 -0.433002745
# 2 -1.2523528  1.058137478
# 3 -0.9460274  0.983916110
# 4  0.8551967 -0.009211208
# 5  0.9480902 -0.381182671
# 6  0.4453192  1.378617326

As you can see, the values are exactly the same as in Example 1. However, this time the result is a data frame instead of a matrix.
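In dplyr 1.0.0 and later, mutate_at is superseded by the combination of mutate and across. The following sketch shows how the code of Example 2 might look in that style. It assumes that dplyr version 1.0.0 or newer is installed; the object name data_scale3 is chosen for illustration only.

```r
library("dplyr")                                  # Assumes dplyr >= 1.0.0 is installed

set.seed(936757)                                  # Same seed as in the example data
data <- data.frame(x1 = runif(100),
                   x2 = runif(100))

data_scale3 <- data %>%                           # Standardize selected columns
  mutate(across(c(x1, x2),                        # Columns to be scaled
                ~ as.vector(scale(.x))))          # Drop matrix structure of scale() output
head(data_scale3)                                 # Head of scaled data
```

As in Example 2, as.vector is applied because scale returns a one-column matrix for each variable, and the data frame columns should stay plain numeric vectors.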

 


This article explained how to standardize data frame variables in the R programming language. Please let me know in the comments section if you have any further questions.

 
