# dplyr Package in R | Tutorial & Programming Examples

The dplyr R package provides many tools for the manipulation of data in R. The dplyr package is part of the tidyverse environment.

- Here you can find the documentation of the dplyr package.
- Here you can find the CRAN page of the dplyr package.

## Examples for the dplyr Package

This section shows examples for some functions of the dplyr package. The examples are based on the following data frame:

    data <- data.frame(x1 = 1:6,                             # Create example data
                       x2 = c(1, 2, 2, 3, 1, 2),
                       x3 = c("F", "B", "C", "E", "A", "D"))
    data                                                     # Print example data
    #   x1 x2 x3
    # 1  1  1  F
    # 2  2  2  B
    # 3  3  2  C
    # 4  4  3  E
    # 5  5  1  A
    # 6  6  2  D

If we want to apply the functions of dplyr, we need to install and load the dplyr package:

    install.packages("dplyr")                                # Install dplyr package
    library("dplyr")                                         # Load dplyr package

Now, we are set up and can move on to the examples!

### Example 1: arrange Function

The arrange function orders the rows of a data set by the values of one or more columns. Let’s use the arrange function to sort our data according to the variable x3:

    arrange(data, x3)                                        # Apply arrange function
    #   x1 x2 x3
    # 1  5  1  A
    # 2  2  2  B
    # 3  3  2  C
    # 4  6  2  D
    # 5  4  3  E
    # 6  1  1  F
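
The arrange function can also sort by several columns at once and in descending order. The following is a minimal sketch of this, assuming the same example data as above; the object name data_sorted is only chosen for illustration:

```r
library(dplyr)

# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:6,
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

# Sort by x2 in descending order; ties in x2 are ordered by x3 (ascending)
data_sorted <- arrange(data, desc(x2), x3)
data_sorted
#   x1 x2 x3
# 1  4  3  E
# 2  2  2  B
# 3  3  2  C
# 4  6  2  D
# 5  5  1  A
# 6  1  1  F
```

The desc function reverses the sort order of a single column, so the remaining columns keep their default ascending order.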

### Example 2: filter Function

The filter function extracts rows of our data by a logical condition. The following R code creates a subset of our original data frame, in which only rows with the value 2 in the variable x2 are retained:

    filter(data, x2 == 2)                                    # Apply filter function
    #   x1 x2 x3
    # 1  2  2  B
    # 2  3  2  C
    # 3  6  2  D
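
Several logical conditions can be combined within a single call of the filter function. The sketch below assumes the example data from above; the object names data_and and data_or are hypothetical and only used for illustration:

```r
library(dplyr)

# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:6,
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

# Conditions separated by a comma must all be TRUE (logical AND)
data_and <- filter(data, x2 == 2, x1 > 2)
data_and
#   x1 x2 x3
# 1  3  2  C
# 2  6  2  D

# The | operator keeps rows where at least one condition is TRUE (logical OR)
data_or <- filter(data, x2 == 1 | x3 == "E")
data_or
#   x1 x2 x3
# 1  1  1  F
# 2  4  3  E
# 3  5  1  A
```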

### Example 3: mutate Function

The mutate function creates new variables from existing ones. With the following R syntax, we can create a new variable x4, which contains the row-wise sum of the variables x1 and x2:

    mutate(data, x4 = x1 + x2)                               # Apply mutate function
    #   x1 x2 x3 x4
    # 1  1  1  F  2
    # 2  2  2  B  4
    # 3  3  2  C  5
    # 4  4  3  E  7
    # 5  5  1  A  6
    # 6  6  2  D  8
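
It is also possible to create several variables in one call of mutate, and later variables may refer to variables created earlier in the same call. A minimal sketch, assuming the example data from above (the variable x5 and the object name data_new are made up for this illustration):

```r
library(dplyr)

# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:6,
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

# x5 is computed from x4, which is created in the same mutate call
data_new <- mutate(data,
                   x4 = x1 + x2,
                   x5 = x4 * 2)
data_new
#   x1 x2 x3 x4 x5
# 1  1  1  F  2  4
# 2  2  2  B  4  8
# 3  3  2  C  5 10
# 4  4  3  E  7 14
# 5  5  1  A  6 12
# 6  6  2  D  8 16
```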

### Example 4: pull Function

The pull function extracts a single column of our data frame and returns it as a vector. The following R code extracts the variable x2:

    pull(data, x2)                                           # Apply pull function
    # [1] 1 2 2 3 1 2

### Example 5: rename Function

The rename function changes the name of certain columns. In this example, we’ll change the name of the third column from x3 to new_name:

    rename(data, new_name = x3)                              # Apply rename function
    #   x1 x2 new_name
    # 1  1  1        F
    # 2  2  2        B
    # 3  3  2        C
    # 4  4  3        E
    # 5  5  1        A
    # 6  6  2        D

### Example 6: sample_n Function

The sample_n function randomly samples N cases from our data frame. The following R syntax samples three rows of our original data without replacement. Note that we are setting a seed beforehand for reproducibility:

    set.seed(765)                                            # Set seed for reproducibility
    sample_n(data, 3)                                        # Apply sample_n function
    #   x1 x2 x3
    # 1  3  2  C
    # 2  4  3  E
    # 3  5  1  A
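
In newer versions of dplyr (1.0.0 or later), sample_n has been superseded by the slice_sample function. The following sketch shows how the same kind of sampling might look with slice_sample, assuming such a dplyr version is installed. The object names data_sample and data_boot are hypothetical, and the sampled rows are not guaranteed to match the sample_n output shown above:

```r
library(dplyr)

# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:6,
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

set.seed(765)                                  # Set seed for reproducibility

# Draw three rows without replacement (the default)
data_sample <- slice_sample(data, n = 3)
data_sample

# Draw ten rows with replacement, i.e. rows may appear more than once
data_boot <- slice_sample(data, n = 10, replace = TRUE)
data_boot
```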

### Example 7: select Function

The select function extracts certain columns from a data frame. The following R programming code creates a subset with the columns x2 and x3:

    select(data, c(x2, x3))                                  # Apply select function
    #   x2 x3
    # 1  1  F
    # 2  2  B
    # 3  2  C
    # 4  3  E
    # 5  1  A
    # 6  2  D
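
Columns can also be selected by excluding certain variables or by specifying a range of neighboring columns. A minimal sketch, assuming the example data from above (the object names data_drop and data_range are only used for illustration):

```r
library(dplyr)

# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:6,
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

# A minus sign drops a column and keeps all remaining columns (x2 and x3)
data_drop <- select(data, -x1)
names(data_drop)

# The colon selects all columns from x1 to x2 in their original order
data_range <- select(data, x1:x2)
names(data_range)
```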

Note that the previous R code could also be applied to a tibble instead of a data frame. Furthermore, the pipe operator (i.e. %>%) could be used to chain several functions. However, for simplicity I stuck to a basic R programming style.
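
For comparison, here is a sketch of how some of the functions shown in the examples above might be combined with the pipe operator. It assumes the same example data; the object name data_piped is hypothetical:

```r
library(dplyr)

# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:6,
                   x2 = c(1, 2, 2, 3, 1, 2),
                   x3 = c("F", "B", "C", "E", "A", "D"))

# Each step passes its result on as first argument of the next function
data_piped <- data %>%
  filter(x2 == 2) %>%          # Keep rows where x2 is equal to 2
  mutate(x4 = x1 + x2) %>%     # Add the sum of x1 and x2
  arrange(desc(x1)) %>%        # Sort by x1 in descending order
  select(x3, x4)               # Keep only the columns x3 and x4
data_piped
#   x3 x4
# 1  D  8
# 2  C  5
# 3  B  4
```

The pipe avoids nested function calls and intermediate objects, which usually makes longer chains of data manipulation steps easier to read.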
