# sample_n & sample_frac R Functions | Sample Data with dplyr Package

This article shows how to take a sample of a data set with the sample_n and sample_frac functions of the dplyr package in the R programming language.

Let’s start right away:

## Creating Example Data

In the examples of this R tutorial, we’ll use the following data frame as a basis:

```r
data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),              # Create example data
                   x2 = c("A", "A", "C", "A", "B", "C"))
data                                                      # Print example data
#   x1 x2
# 1  1  A
# 2  2  A
# 3  1  C
# 4  3  A
# 5  2  B
# 6  3  C
```

Our data contains six rows and two columns. Note that we could also use a tibble instead of a data frame.
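As a minimal sketch of the tibble alternative mentioned above, we could convert the example data with the as_tibble function of the tibble package (which is installed together with dplyr). The object name data_tbl is just an illustrative choice and is not used in the remainder of this tutorial:

```r
library(dplyr)                                            # Load dplyr

data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),              # Re-create example data
                   x2 = c("A", "A", "C", "A", "B", "C"))

data_tbl <- tibble::as_tibble(data)                       # Convert data frame to tibble
data_tbl                                                  # Print tibble

tbl_sample <- sample_n(data_tbl, 3)                       # Sampling works on tibbles, too
class(tbl_sample)                                         # Result is still a tibble
```

The sampling functions shown below should behave the same way for both input types; only the class of the returned object differs.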

To make the sample_n and sample_frac functions available, we need to install and load the dplyr package:

```r
install.packages("dplyr")                                 # Install dplyr
library("dplyr")                                          # Load dplyr
```

Since we are going to randomly sample data, it also makes sense to set a seed for reproducibility:

`set.seed(15151)                                           # Set seed`
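To illustrate why the seed matters, the following sketch draws the same sample twice, resetting the seed before each draw. The object names first_draw and second_draw are only illustrative:

```r
library(dplyr)                                            # Load dplyr

data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),              # Re-create example data
                   x2 = c("A", "A", "C", "A", "B", "C"))

set.seed(15151)                                           # Set seed
first_draw <- sample_n(data, 3)                           # First random draw

set.seed(15151)                                           # Reset the same seed
second_draw <- sample_n(data, 3)                          # Second random draw

identical(first_draw, second_draw)                        # Compare both draws
```

Because the random number generator starts from the same state each time, both draws contain exactly the same rows. Without resetting the seed, the second draw would usually differ from the first.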

Now we are set up and can move on to the application of the sample_n and sample_frac functions…

## Example 1: Sampling N Cases with sample_n Function

Example 1 shows how to apply the sample_n function. The sample_n function returns a random sample of a specified number of rows from our original data frame.

Let’s assume that we want to extract a subsample of three cases. Then, we can apply the sample_n command as follows:

```r
sample_n(data, 3)                                         # Apply sample_n
#   x1 x2
# 1  3  C
# 2  2  A
# 3  1  C
```

The previous RStudio console output shows the result: a subset of our data frame with three rows.
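By default, sample_n draws without replacement, so the sample size cannot exceed the number of rows of the input data. As a small sketch, the replace argument of sample_n can be set to TRUE in case we want to draw more rows than our data contains (e.g. for bootstrapping). The object name data_boot is only illustrative:

```r
library(dplyr)                                            # Load dplyr

data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),              # Re-create example data
                   x2 = c("A", "A", "C", "A", "B", "C"))

set.seed(15151)                                           # Set seed
data_boot <- sample_n(data, 10, replace = TRUE)           # Sample with replacement
nrow(data_boot)                                           # Number of sampled rows
# [1] 10
```

Since the six original rows are drawn ten times, some of them necessarily appear more than once in the output.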

## Example 2: Sampling Fraction of Data with sample_frac Function

In contrast to sample_n, the sample_frac function samples a fraction (i.e. a percentage) of the rows of the input data frame. For instance, we can sample a fraction of 33% with the following R code:

```r
sample_frac(data, 0.33)                                   # Apply sample_frac
#   x1 x2
# 1  2  A
# 2  3  C
```

Since 33% of six rows is 1.98, which is rounded to two, the sample_frac function retains two rows of our original data.
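The row count can be checked with a short calculation. The sketch below assumes that sample_frac rounds the product of the fraction and the number of rows to the nearest integer, which matches the output shown above. The object names frac and expected_rows are only illustrative:

```r
library(dplyr)                                            # Load dplyr

data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),              # Re-create example data
                   x2 = c("A", "A", "C", "A", "B", "C"))

frac <- 0.33                                              # Fraction to sample
expected_rows <- round(frac * nrow(data))                 # 0.33 * 6 = 1.98, rounded
expected_rows                                             # Expected number of rows
# [1] 2

nrow(sample_frac(data, frac))                             # Actual number of rows
# [1] 2
```

Note that in dplyr 1.0.0 and later, sample_n and sample_frac are superseded by the slice_sample function. They still work, but the rounding behavior of slice_sample may differ, so it is worth checking the row count when switching.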

## Video & Further Resources

Furthermore, I recommend having a look at the other R tutorials on this website, in particular those about the dplyr package and the sampling of data in R.