dist Function in R (4 Examples) | Compute Euclidean & Manhattan Distance

 

This article illustrates how to compute distance matrices using the dist function in R.

The article consists of four examples for the application of the dist function. Before the examples, it covers the definition and basic syntax of the function and the creation of some example data.

So now the part you have been waiting for: the example R code.

Definition & Basic R Syntax of dist Function

 

Definition: The dist R function computes a distance matrix, i.e. the distances between the rows of a data matrix.

 

Basic R Syntax: You can find the basic R programming syntax of the dist function below.

dist(x)                             # Basic R syntax of dist function

In the remainder of this post, I’ll illustrate in four examples how to use the dist function in R programming.
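For orientation, the following sketch prints the argument list of dist() with its default values and applies the function to a tiny toy matrix. The object names x_demo and d_demo are made up for this illustration and are not part of the example data of this tutorial.

```r
# Inspect the arguments of dist() and their default values
args(dist)
# function (x, method = "euclidean", diag = FALSE, upper = FALSE, p = 2)

# Hypothetical toy matrix: two points (rows) in two dimensions
x_demo <- matrix(c(0, 0,
                   3, 4), nrow = 2, byrow = TRUE)

d_demo <- dist(x_demo)              # Euclidean distance between the two rows
d_demo                              # sqrt(3^2 + 4^2) = 5
```

The method, diag, and upper arguments are the ones used in the examples of this tutorial.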

 

Creation of Example Data

As a first step, we’ll have to define some example data.

set.seed(98274)                     # Create random matrix
data <- matrix(runif(40), nrow = 5)
data                                # Print data matrix
#           [,1]      [,2]      [,3]      [,4]      [,5]       [,6]       [,7]      [,8]
# [1,] 0.7118180 0.5109773 0.5873743 0.5141773 0.0405815 0.02276534 0.07710989 0.4853844
# [2,] 0.4936169 0.3623344 0.4664132 0.8704115 0.4307836 0.30439913 0.45256307 0.7631912
# [3,] 0.8001764 0.9941515 0.9678332 0.5046720 0.9333047 0.90857556 0.04199172 0.4390112
# [4,] 0.4271324 0.9435018 0.2224725 0.4434530 0.6212999 0.23565634 0.75857683 0.2018782
# [5,] 0.2947591 0.4652785 0.2969466 0.4901909 0.8220045 0.70756700 0.94728656 0.1381266

The previous output of the RStudio console shows that our example data is a numeric matrix with five rows and eight columns.
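Because dist works on the rows of its input, it can be worth checking the dimensions of the data first. The sketch below does this and also shows that a numeric data frame can be used in place of a matrix. The names n_pairs, data_df, d_mat, and d_df are introduced here for illustration only.

```r
set.seed(98274)                     # Same seed as in the tutorial
data <- matrix(runif(40), nrow = 5)

dim(data)                           # Number of rows and columns

n_pairs <- choose(nrow(data), 2)    # Number of row pairs = number of distances
n_pairs

data_df <- as.data.frame(data)      # Assumption: same values stored as data frame

d_mat <- dist(data)                 # Distances based on the matrix
d_df <- dist(data_df)               # Distances based on the data frame

all.equal(as.vector(d_mat), as.vector(d_df))  # Compare the distance values
```

If your own data contains non-numeric columns, you would typically select the numeric columns before passing the data to dist.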

 

Example 1: Compute Euclidean Distance Using Default Specifications of dist() Function

In Example 1, I’ll illustrate how to use the dist() function to calculate a distance matrix of our example data in R. For this task, we simply need to insert our matrix into the dist function:

dist(data)                          # Apply dist
#           1         2         3         4
# 2 0.8129931                              
# 3 1.4039595 1.3302545                    
# 4 1.1548195 1.0167178 1.3494188          
# 5 1.4894028 1.0837360 1.3958275 0.7460621

Have a look at the output of the RStudio console. It shows the distance between each pair of rows in our data.

Note that the dist function computes the Euclidean Distance by default. However, it is also possible to use other distance metrics…
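To see what the Euclidean default does, the distance between two rows can be recomputed by hand as the square root of the sum of squared differences. The following is a minimal sketch; manual_12 and dist_12 are names made up for this check.

```r
set.seed(98274)                     # Recreate the example data
data <- matrix(runif(40), nrow = 5)

d <- dist(data)                     # Euclidean distances (default)

# Manual Euclidean distance between row 1 and row 2
manual_12 <- sqrt(sum((data[1, ] - data[2, ])^2))

# Same value extracted from the dist object
dist_12 <- as.matrix(d)[2, 1]

all.equal(manual_12, unname(dist_12))  # Check that both values agree
```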

 

Example 2: Compute Manhattan Distance Using dist Function & method Argument

The following syntax shows how to create a matrix containing the Manhattan distances between the rows of our data matrix using the method argument:

dist(data, method = "manhattan")    # Apply dist & method argument
#          1        2        3        4
# 2 2.169135                           
# 3 2.821522 3.646985                  
# 4 2.911419 2.445137 3.168916         
# 5 3.460831 2.765666 3.238146 1.656885

Note that the dist command provides many different distance measures, including the Euclidean, Maximum, Manhattan, Canberra, Binary, and Minkowski distances.
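As a rough sketch of how some of these measures relate to each other, the code below recomputes the Manhattan and Maximum distances between the first two rows by hand and compares the Manhattan distance with the Minkowski distance for p = 1. All object names besides data are made up for this illustration.

```r
set.seed(98274)                     # Recreate the example data
data <- matrix(runif(40), nrow = 5)

diff_12 <- data[1, ] - data[2, ]    # Differences between row 1 and row 2

manhattan_12 <- sum(abs(diff_12))   # Manhattan: sum of absolute differences
maximum_12 <- max(abs(diff_12))     # Maximum: largest absolute difference

d_manhattan <- dist(data, method = "manhattan")
d_maximum <- dist(data, method = "maximum")
d_minkowski1 <- dist(data, method = "minkowski", p = 1)

all.equal(manhattan_12, as.matrix(d_manhattan)[2, 1])       # Manual vs. dist
all.equal(maximum_12, as.matrix(d_maximum)[2, 1])           # Manual vs. dist
all.equal(as.vector(d_manhattan), as.vector(d_minkowski1))  # Minkowski with p = 1
```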

 

Example 3: Compute Distance & Diagonal of Distance Matrix

In this example, I’ll explain how to create a distance matrix that also contains the diagonal. For this, we have to set the diag argument to TRUE:

dist(data, diag = TRUE)             # Apply dist & diag argument
#           1         2         3         4         5
# 1 0.0000000                                        
# 2 0.8129931 0.0000000                              
# 3 1.4039595 1.3302545 0.0000000                    
# 4 1.1548195 1.0167178 1.3494188 0.0000000          
# 5 1.4894028 1.0837360 1.3958275 0.7460621 0.0000000
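It may help to know that the diag argument only changes how the result is printed. The stored distances stay the same, as the following sketch suggests. The names d_default and d_diag are introduced here for illustration.

```r
set.seed(98274)                     # Recreate the example data
data <- matrix(runif(40), nrow = 5)

d_default <- dist(data)             # Without diagonal
d_diag <- dist(data, diag = TRUE)   # With diagonal

length(d_diag)                      # Only the 10 pairwise distances are stored
attr(d_diag, "Diag")                # Printing flag stored as attribute

all.equal(as.vector(d_default), as.vector(d_diag))  # Same distance values

print(d_default, diag = TRUE)       # Diagonal can also be requested when printing
```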

 

Example 4: Compute Distance & Upper Triangle of Distance Matrix

Next, I’ll explain how to print a distance matrix with both the lower and the upper triangle using the upper argument of the dist function:

dist(data, upper = TRUE)            # Apply dist & upper argument
#           1         2         3         4         5
# 1           0.8129931 1.4039595 1.1548195 1.4894028
# 2 0.8129931           1.3302545 1.0167178 1.0837360
# 3 1.4039595 1.3302545           1.3494188 1.3958275
# 4 1.1548195 1.0167178 1.3494188           0.7460621
# 5 1.4894028 1.0837360 1.3958275 0.7460621
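The diag and upper arguments can also be combined. In case you need a regular square matrix for further calculations, one option is to convert the dist object with as.matrix. The sketch below illustrates both; the name m is made up for this example.

```r
set.seed(98274)                     # Recreate the example data
data <- matrix(runif(40), nrow = 5)

dist(data, diag = TRUE, upper = TRUE)  # Print diagonal & both triangles

m <- as.matrix(dist(data))          # Convert dist object to square matrix

dim(m)                              # Five rows and five columns
all(m == t(m))                      # Matrix is symmetric
all(diag(m) == 0)                   # Diagonal contains only zeros
m[4, 5]                             # Distance between rows 4 and 5
```

The matrix returned by as.matrix can be indexed like any other numeric matrix in R.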

 

Video & Further Resources

I have recently published a video on my YouTube channel, which shows the R code of this tutorial.

 

 

Additionally, you could have a look at the other articles on my website.

 

This tutorial showed how to apply the dist function in the R programming language. Let me know in the comments section if you have any further questions. Furthermore, please subscribe to my email newsletter to get updates on the newest articles.

 



2 Comments. Leave new

  • Hi I was just following your tutorial on the distance function within R, for the sample data am I able to run this code on data I already have? If so do I import it as a dataset and work from there?
