# sample_n & sample_frac R Functions | Sample Data with dplyr Package

This article shows how to take a sample of a data set with the sample_n and sample_frac functions of the dplyr package in the R programming language.

Let’s start right away:

## Creating Example Data

In the examples of this R tutorial, we’ll use the following data frame as a basis:

```r
data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),    # Create example data
                   x2 = c("A", "A", "C", "A", "B", "C"))
data                                            # Print example data
#   x1 x2
# 1  1  A
# 2  2  A
# 3  1  C
# 4  3  A
# 5  2  B
# 6  3  C
```

Our data contains six rows and two columns. Note that we could also use a tibble instead of a data frame.

To make the sample_n and sample_frac functions available, we need to install and load the dplyr package in R:

```r
install.packages("dplyr")    # Install dplyr
library("dplyr")             # Load dplyr
```

Since we are going to randomly sample data, it also makes sense to set a seed for reproducibility:

`set.seed(15151) # Set seed`
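To illustrate why the seed matters, here is a minimal sketch that draws the same sample twice. It assumes that dplyr is installed and rebuilds the example data so that it runs on its own; the object names `first_draw` and `second_draw` are only chosen for this illustration.

```r
library("dplyr")                                 # Load dplyr

data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),     # Recreate example data
                   x2 = c("A", "A", "C", "A", "B", "C"))

set.seed(15151)                                  # Set seed
first_draw <- sample_n(data, 3)                  # First random sample

set.seed(15151)                                  # Reset the same seed
second_draw <- sample_n(data, 3)                 # Second random sample

identical(first_draw, second_draw)               # Both draws should contain the same rows
```

Without resetting the seed before the second call, the two samples would usually differ.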

Now we are set up and can move on to the application of the sample_n and sample_frac functions…

## Example 1: Sampling N Cases with sample_n Function

Example 1 shows how to apply the sample_n function. The sample_n function returns a random sample of a specified size (i.e. number of rows) from our original data frame.

Let’s assume that we want to extract a subsample of three cases. Then, we can apply the sample_n command as follows:

```r
sample_n(data, 3)    # Apply sample_n
#   x1 x2
# 1  3  C
# 2  2  A
# 3  1  C
```

The previous RStudio console output shows the result: a subset of our data frame with three rows.
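By default, sample_n draws without replacement, so the sample size cannot exceed the number of rows of the input data. The following sketch shows how the `replace` argument of sample_n can be used to draw more rows than the data contains. The object name `data_boot` is only chosen for this illustration, and the sampled rows are not printed because they depend on the random draw.

```r
library("dplyr")                                    # Load dplyr

data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),        # Recreate example data
                   x2 = c("A", "A", "C", "A", "B", "C"))

set.seed(15151)                                     # Set seed

data_boot <- sample_n(data, 10, replace = TRUE)     # Sample 10 rows with replacement
nrow(data_boot)                                     # Number of sampled rows
```

Since the data has only six rows, some rows necessarily appear more than once in `data_boot`. Calling `sample_n(data, 10)` without `replace = TRUE` would stop with an error instead.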

## Example 2: Sampling Fraction of Data with sample_frac Function

In contrast to sample_n, the sample_frac function samples a fraction (i.e. a proportion) of the rows of the input data frame. For instance, we can sample a fraction of 33% with the following R code:

```r
sample_frac(data, 0.33)    # Apply sample_frac
#   x1 x2
# 1  2  A
# 2  3  C
```

Since 33% of six rows is 1.98, which rounds to two, the sample_frac function retains two rows of our original data.
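The arithmetic behind this row count can be checked with a short sketch. It assumes that sample_frac rounds the product of fraction and row count to the nearest integer, which matches the result shown above; the object names `frac`, `expected_rows` and `data_frac` are only chosen for this illustration.

```r
library("dplyr")                                    # Load dplyr

data <- data.frame(x1 = c(1, 2, 1, 3, 2, 3),        # Recreate example data
                   x2 = c("A", "A", "C", "A", "B", "C"))

set.seed(15151)                                     # Set seed

frac <- 0.33                                        # Fraction to sample
expected_rows <- round(frac * nrow(data))           # round(0.33 * 6) = round(1.98) = 2

data_frac <- sample_frac(data, frac)                # Apply sample_frac
nrow(data_frac) == expected_rows                    # Compare actual and expected row count
```

With a fraction of 0.5, the same calculation gives three rows, i.e. half of the example data.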

## Video & Further Resources

I also recommend having a look at the other R tutorials on this website, in particular those about the dplyr package and the sampling of data in R.