dplyr Package in R | Tutorial & Programming Examples

 

The dplyr R package provides many tools for the manipulation of data in R. The dplyr package is part of the tidyverse environment.

  • Here you can find the documentation of the dplyr package.
  • Here you can find the CRAN page of the dplyr package.

 

Examples for the dplyr Package

This section shows examples for some functions of the dplyr package. The examples are based on the following data frame:

data <- data.frame(x1 = 1:6,                # Create example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))
data                                        # Print example data
#   x1 x2 x3
# 1  1  1  F
# 2  2  2  B
# 3  3  2  C
# 4  4  3  E
# 5  5  1  A
# 6  6  2  D

If we want to apply the functions of dplyr, we need to install and load the dplyr package:

install.packages("dplyr")                   # Install dplyr package
library("dplyr")                            # Load dplyr package

Now, we are set up and can move on to the examples!

 

Example 1: arrange Function

The arrange function orders data sets according to a certain column of our data. Let’s use the arrange function to sort our data according to the variable x3:

arrange(data, x3)                           # Apply arrange function
#   x1 x2 x3
# 1  5  1  A
# 2  2  2  B
# 3  3  2  C
# 4  6  2  D
# 5  4  3  E
# 6  1  1  F
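
The same function can also sort by several columns and in descending order. The following lines are a minimal sketch, assuming the example data frame from above; the object name data_desc is only chosen for illustration. The desc helper of dplyr reverses the sort order of x2, and ties are broken by x3:

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

data_desc <- arrange(data, desc(x2), x3)    # x2 descending, ties sorted by x3
data_desc
#   x1 x2 x3
# 1  4  3  E
# 2  2  2  B
# 3  3  2  C
# 4  6  2  D
# 5  5  1  A
# 6  1  1  F
```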

 

Example 2: filter Function

The filter function extracts rows of our data by a logical condition. The following R code creates a subset of our original data frame, in which only rows with the value 2 in the variable x2 are retained:

filter(data, x2 == 2)                       # Apply filter function
#   x1 x2 x3
# 1  2  2  B
# 2  3  2  C
# 3  6  2  D
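
Logical conditions can also be combined. The sketch below assumes the same example data; the object names data_and and data_or are hypothetical. Conditions separated by a comma must all be TRUE, whereas the | operator keeps rows for which at least one condition holds:

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

data_and <- filter(data, x2 == 2, x1 > 2)   # Both conditions must hold
data_and
#   x1 x2 x3
# 1  3  2  C
# 2  6  2  D

data_or <- filter(data, x2 == 1 | x3 == "E") # At least one condition must hold
data_or
#   x1 x2 x3
# 1  1  1  F
# 2  4  3  E
# 3  5  1  A
```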

 

Example 3: mutate Function

The mutate function creates new variables from existing ones. With the following R syntax, we can create a new variable x4, which contains the row-wise sums of the variables x1 and x2:

mutate(data, x4 = x1 + x2)                  # Apply mutate function
#   x1 x2 x3 x4
# 1  1  1  F  2
# 2  2  2  B  4
# 3  3  2  C  5
# 4  4  3  E  7
# 5  5  1  A  6
# 6  6  2  D  8
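
Several variables can be created in a single call, and a later expression may refer to a variable created earlier in the same call. The following sketch assumes the example data from above; the column x5 and the object name data_new are only illustrative:

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

data_new <- mutate(data,
                   x4 = x1 + x2,            # Row-wise sum of x1 and x2
                   x5 = x4 * 2)             # Uses x4 created one line above
data_new
#   x1 x2 x3 x4 x5
# 1  1  1  F  2  4
# 2  2  2  B  4  8
# 3  3  2  C  5 10
# 4  4  3  E  7 14
# 5  5  1  A  6 12
# 6  6  2  D  8 16
```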

 

Example 4: pull Function

The pull function extracts a single column of our data frame and returns it as a vector. The following R code extracts the variable x2:

pull(data, x2)                              # Apply pull function
# [1] 1 2 2 3 1 2
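
Because pull returns a plain vector, its result can be passed directly to functions such as mean, whereas select always returns a data frame. A small sketch, assuming the example data from above (x2_vec and x2_df are hypothetical object names):

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

x2_vec <- pull(data, x2)                    # Vector of the x2 values
is.vector(x2_vec)
# [1] TRUE
mean(x2_vec)
# [1] 1.833333

x2_df <- select(data, x2)                   # Data frame with one column
is.data.frame(x2_df)
# [1] TRUE
```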

 

Example 5: rename Function

The rename function changes the name of certain columns. In this example, we’ll change the name of the third column from x3 to new_name:

rename(data, new_name = x3)                 # Apply rename function
#   x1 x2 new_name
# 1  1  1        F
# 2  2  2        B
# 3  3  2        C
# 4  4  3        E
# 5  5  1        A
# 6  6  2        D
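
Several columns can be renamed in one call by listing multiple new_name = old_name pairs. The sketch below assumes the example data from above; the new column names id, group, and label are invented for illustration:

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

data_renamed <- rename(data,                # Rename all three columns
                       id = x1,
                       group = x2,
                       label = x3)
names(data_renamed)
# [1] "id"    "group" "label"
```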

 

Example 6: sample_n Function

The sample_n function randomly samples N cases from our data frame. The following R syntax samples three rows of our original data without replacement. Note that we set a seed beforehand for reproducibility:

set.seed(765)                               # Set seed for reproducibility
sample_n(data, 3)                           # Apply sample_n function
#   x1 x2 x3
# 1  3  2  C
# 2  4  3  E
# 3  5  1  A
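
In dplyr 1.0.0 and later, sample_n is marked as superseded in favor of slice_sample. The following sketch shows the equivalent call, assuming such a dplyr version is installed; data_sample is a hypothetical object name. The sampled rows depend on the seed and the R version, so no output is shown here:

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

set.seed(765)                               # Set seed for reproducibility
data_sample <- slice_sample(data, n = 3)    # Three rows, without replacement
nrow(data_sample)
# [1] 3
```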

 

Example 7: select Function

The select function extracts certain columns from a data frame. The following R programming code creates a subset with the columns x2 and x3:

select(data, c(x2, x3))                     # Apply select function
#   x2 x3
# 1  1  F
# 2  2  B
# 3  2  C
# 4  3  E
# 5  1  A
# 6  2  D
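
The same subset can be specified in other ways. As a sketch based on the example data from above (the object names are only illustrative), the : operator selects a range of adjacent columns, and a minus sign drops a column:

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

data_range <- select(data, x2:x3)           # All columns from x2 to x3
data_drop <- select(data, -x1)              # All columns except x1

names(data_range)
# [1] "x2" "x3"
identical(data_range, data_drop)            # Both calls give the same subset
# [1] TRUE
```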

Note that the previous R code could also be applied to a tibble instead of a data frame. Furthermore, the pipe operator (i.e. %>%) could be used. However, for simplicity, I stuck to a basic R programming style.
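
For comparison, here is a sketch of how several of the functions above could be chained with the pipe operator, assuming the same example data. Each step passes its result on as the first argument of the next function. The object names data_piped and tbl_piped are hypothetical; as_tibble is re-exported by dplyr and converts the data frame to a tibble:

```r
library(dplyr)

data <- data.frame(x1 = 1:6,                # Recreate example data
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

data_piped <- data %>%
  filter(x2 == 2) %>%                       # Keep rows with x2 equal to 2
  mutate(x4 = x1 + x2) %>%                  # Add the row-wise sum
  arrange(desc(x1)) %>%                     # Sort by x1, descending
  select(x3, x4)                            # Keep only x3 and x4
data_piped
#   x3 x4
# 1  D  8
# 2  C  5
# 3  B  4

tbl_piped <- as_tibble(data) %>%            # Same steps applied to a tibble
  filter(x2 == 2) %>%
  mutate(x4 = x1 + x2) %>%
  arrange(desc(x1)) %>%
  select(x3, x4)
```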

 

Video: Introduction to the dplyr Package in R

A video on the Statistics Globe YouTube channel explains some of the most important functions of the dplyr package.


 

