Rank Functions of dplyr Package in R (row_number, ntile, min_rank, dense_rank, percent_rank & cume_dist)

 

In this tutorial, I’ll illustrate how to apply the rank functions of the dplyr package in the R programming language. The rank functions of dplyr are row_number, ntile, min_rank, dense_rank, percent_rank, and cume_dist.

The tutorial consists of six examples, each of which explains one of the rank functions.

 

Example Data

In this R tutorial, we’ll apply the rank functions of the dplyr add-on package to the following example vector:

x <- c(4, 1, 5, 2, 3, 3)       # Create example vector

Our vector consists of six numeric values ranging from 1 to 5, whereby the value 3 appears twice.

Furthermore, we need to install and load the dplyr package:

install.packages("dplyr")      # Install dplyr (only needed once)
library("dplyr")               # Load dplyr

Now, we can move on to the examples.

 

Example 1: row_number Function

Example 1 explains how to use the row_number function in R. Have a look at the following R code:

row_number(x)                  # Apply row_number function
# 5 1 6 2 3 4

The row_number function returns the rank of each value of our input vector. Note that the value 3, which appears twice, receives two different ranks: ties are broken by position, so the second 3 is ranked one position higher than the first 3.
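
As a quick cross-check, the tie handling of row_number can be reproduced with the base R rank function. The following is only a sketch: it assumes that row_number corresponds to rank with ties.method = "first", and the object names rn and base_rank are merely illustrative.

```r
library(dplyr)

x <- c(4, 1, 5, 2, 3, 3)                      # Example vector of this tutorial

rn <- row_number(x)                           # dplyr ranking, ties broken by position
base_rank <- rank(x, ties.method = "first")   # Assumed base R counterpart

rn                                            # Print dplyr ranks
base_rank                                     # Print base R ranks
all(rn == base_rank)                          # Check whether both rankings agree
```

If the assumption holds, the last line returns TRUE and both vectors show the ranks 3 and 4 for the two occurrences of the value 3.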

 

Example 2: ntile Function

In the second example, you’ll learn how to apply the ntile function. The ntile function is the only dplyr ranking function that takes two arguments as input: the input vector (i.e. x) and an integer (i.e. 3). The integer defines the number of groups the values are split into.

ntile(x, 3)                    # Apply ntile function
# 3 1 3 1 2 2

The lowest two values of our input vector (i.e. 1 and 2) are assigned to group 1, the value 3 is assigned to group 2, and the highest two values of our input vector (i.e. 4 and 5) are assigned to group 3.
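
The groups of ntile appear to be built from position-based ranks as in Example 1, so tied values do not necessarily end up in the same group. The following sketch illustrates this with two instead of three groups; the object name groups_2 is just an illustrative choice.

```r
library(dplyr)

x <- c(4, 1, 5, 2, 3, 3)       # Example vector of this tutorial

groups_2 <- ntile(x, 2)        # Split values into two groups of three values each
groups_2
# 2 1 2 1 1 2
```

In this case, the first 3 is assigned to group 1, whereas the second 3 is assigned to group 2. Keep this in mind whenever your data contains many ties.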

 

Example 3: min_rank Function

This example illustrates the usage of the min_rank function:

min_rank(x)                    # Apply min_rank function
# 5 1 6 2 3 3

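The output of this function is the same as the output of the row_number command of Example 1, except for ties: both occurrences of the value 3 receive the same rank (i.e. the smallest rank of the tied values, which is 3). The rank 4 is skipped, so the next larger value (i.e. 4) is assigned the rank 5.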
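
The sketch below compares min_rank with the base R rank function and shows how the ranking could be reversed with the desc helper function of dplyr. It assumes that min_rank corresponds to rank with ties.method = "min"; the object names asc_rank, base_min, and desc_rank are only illustrative.

```r
library(dplyr)

x <- c(4, 1, 5, 2, 3, 3)                   # Example vector of this tutorial

asc_rank <- min_rank(x)                    # Ascending ranks, ties share the smallest rank
base_min <- rank(x, ties.method = "min")   # Assumed base R counterpart
desc_rank <- min_rank(desc(x))             # Descending ranks (largest value gets rank 1)

all(asc_rank == base_min)                  # Check whether both rankings agree
desc_rank
# 2 6 1 5 3 3
```

In the descending version, the largest value (i.e. 5) gets the rank 1, and the two tied values still share the rank 3.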

 

Example 4: dense_rank Function

Example 4 shows how to use the dense_rank function in R:

dense_rank(x)                  # Apply dense_rank function
# 4 1 5 2 3 3

The dense_rank function also returns the ranks of our input vector. In contrast to the min_rank function, dense_rank does not leave gaps after ties: even though the value 3 appears twice, the next larger value receives a rank that is only one number higher (i.e. 4 instead of 5).
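
To see the difference between the two functions side by side, both rankings can be added as columns to a data frame. This is just a minimal sketch; the data frame name comparison and the column names rank_min and rank_dense are hypothetical.

```r
library(dplyr)

x <- c(4, 1, 5, 2, 3, 3)                   # Example vector of this tutorial

comparison <- data.frame(x = x) %>%        # Store vector in data frame
  mutate(rank_min = min_rank(x),           # Ranks with gaps after ties
         rank_dense = dense_rank(x)) %>%   # Ranks without gaps
  arrange(x)                               # Sort rows by x for readability

comparison
#   x rank_min rank_dense
# 1 1        1          1
# 2 2        2          2
# 3 3        3          3
# 4 3        3          3
# 5 4        5          4
# 6 5        6          5
```

The highest dense rank (i.e. 5) equals the number of distinct values in our vector, whereas the highest min rank (i.e. 6) equals the number of vector elements.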

 

Example 5: percent_rank Function

In Example 5 we’ll apply the percent_rank function:

percent_rank(x)                # Apply percent_rank function
# 0.8 0.0 1.0 0.2 0.4 0.4

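This R function rescales the ranks returned by min_rank to the range between 0 and 1. In our example, the smallest value (i.e. 1) is mapped to 0, the largest value (i.e. 5) is mapped to 1, and the two tied values share the result 0.4.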
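
The values above can be reproduced by hand. The sketch below assumes that percent_rank is computed as (min_rank - 1) / (n - 1) for a vector without missing values, where n is the number of elements; the object name manual_pr is just illustrative.

```r
library(dplyr)

x <- c(4, 1, 5, 2, 3, 3)                           # Example vector of this tutorial

manual_pr <- (min_rank(x) - 1) / (length(x) - 1)   # Assumed formula behind percent_rank
manual_pr
# 0.8 0.0 1.0 0.2 0.4 0.4

all.equal(percent_rank(x), manual_pr)              # Compare with dplyr result
```

For instance, the value 4 has the min rank 5, which leads to (5 - 1) / (6 - 1) = 0.8.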

 

Example 6: cume_dist Function

Finally, we apply the cume_dist function:

cume_dist(x)                   # Apply cume_dist function
# 0.8333333 0.1666667 1.0000000 0.3333333 0.6666667 0.6666667

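The cume_dist function is a cumulative distribution function, which returns the proportion of all values that are less than or equal to the current value. For example, five of our six values are less than or equal to 4, which gives 5 / 6 = 0.8333333.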
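
This definition can be checked manually with base R. The following sketch computes, for each element, the share of values that are less than or equal to it; the object name manual_cd is only an illustrative choice, and the code assumes a vector without missing values.

```r
library(dplyr)

x <- c(4, 1, 5, 2, 3, 3)                           # Example vector of this tutorial

manual_cd <- sapply(x, function(v) mean(x <= v))   # Share of values <= current value
manual_cd
# 0.8333333 0.1666667 1.0000000 0.3333333 0.6666667 0.6666667

all.equal(cume_dist(x), manual_cd)                 # Compare with dplyr result
```

Both occurrences of the value 3 return 4 / 6 = 0.6666667, because four values (i.e. 1, 2, 3, and 3) are less than or equal to 3.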

 


 

In summary: In this R tutorial, you learned how to use the windowed rank functions of the dplyr package. Don’t hesitate to let me know in the comments below if you have any additional questions. Furthermore, please subscribe to my email newsletter for updates on the newest articles.
