# Proportions with dplyr Package in R (Example) | Create Relative Frequency Table

In this R tutorial you’ll learn how to compute relative frequencies / proportions with the dplyr package.

Let’s dig in…

## Creating Example Data

The example data that we’ll use in this tutorial looks as follows:

```r
set.seed(9876)                                       # Create random example data
data <- data.frame(x = sample(1:5, 100, replace = TRUE),
                   y = sample(LETTERS[1:3], 100, replace = TRUE))
head(data)                                           # Print 6 rows of example data
#   x y
# 1 5 C
# 2 2 A
# 3 1 C
# 4 3 B
# 5 3 B
# 6 2 C
```

Our example data frame consists of 100 rows and two columns. The variable x contains the values 1, 2, 3, 4, and 5; and the variable y consists of the values A, B, and C.

Furthermore, we have to install and load the dplyr package:

```r
install.packages("dplyr")                            # Install and load dplyr
library("dplyr")
```

In the following example, we’ll create a table, representing the relative frequencies / proportions of our example data.

## Example: Get Relative Frequencies of Data Frame in R

In order to create a frequency table with the dplyr package, we can use a combination of the group_by, summarise, n, mutate, and sum functions. Have a look at the following R syntax:

```r
data %>%                                             # Create tibble with frequencies
  group_by(x, y) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
# # A tibble: 15 x 4
# # Groups:   x [5]
#        x y         n  freq
#    <int> <fct> <int> <dbl>
#  1     1 A         6 0.316
#  2     1 B         6 0.316
#  3     1 C         7 0.368
#  4     2 A         4 0.222
#  5     2 B         5 0.278
#  6     2 C         9 0.5
#  7     3 A         8 0.296
#  8     3 B        10 0.370
#  9     3 C         9 0.333
# 10     4 A         6 0.333
# 11     4 B         9 0.5
# 12     4 C         3 0.167
# 13     5 A         5 0.278
# 14     5 B         8 0.444
# 15     5 C         5 0.278
```

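As you can see based on the output of the RStudio console, the previous R code returned a tibble containing each combination of our two variables x and y, the count of each combination, and its relative frequency. Note that summarise removes the last grouping variable (y), so the tibble is still grouped by x (see `# Groups: x [5]` in the output). The column freq is therefore the proportion of each value of y within each value of x, i.e. the freq values of each x group sum to 1. For example, x = 1 occurs 19 times in total, so the combination x = 1 and y = A has freq = 6 / 19 ≈ 0.316. Note that the previous R code is based on this thread on Stack Overflow.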
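Since the result is still grouped by x, the proportions above are computed within each value of x rather than across the whole data set. The following sketch contrasts both variants. It assumes dplyr 1.0.0 or later (for the `.groups` argument of summarise), and the object names `freq_within` and `freq_overall` are only illustrative:

```r
library(dplyr)

set.seed(9876)                                       # Recreate the example data
data <- data.frame(x = sample(1:5, 100, replace = TRUE),
                   y = sample(LETTERS[1:3], 100, replace = TRUE))

# Proportions within each value of x (same logic as the example above);
# .groups = "drop_last" makes the default regrouping by x explicit
freq_within <- data %>%
  group_by(x, y) %>%
  summarise(n = n(), .groups = "drop_last") %>%
  mutate(freq = n / sum(n))

# Proportions relative to all rows of the data frame;
# count() returns an ungrouped table, so sum(n) is the total row count
freq_overall <- data %>%
  count(x, y) %>%
  mutate(freq = n / sum(n))
```

In `freq_within`, the freq values add up to 1 for each value of x. In `freq_overall`, they add up to 1 across the entire table. Which variant is appropriate depends on the question you want to answer.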

## Video & Further Resources

I have recently published a video on my YouTube channel, which explains the contents of this post. You can find the video below:

In summary: In this article, I illustrated how to summarize categorical variables in a frequency / proportion table with the dplyr package in R programming. If you have additional comments or questions, please let me know in the comments section. Besides that, don’t forget to subscribe to my email newsletter in order to get updates on new articles.

• Can you make a graph with the frequency table?

• Hi Gisla,

Thank you for your comment. Could you specify in some more detail what such a graph should look like?

Regards,

Joachim

• Hi Joachim,

Your script prints the table in the console window. I would like the new columns (n and freq) to be added to my initial table (x and y). This way, the results are stored and can be used for making plots.
How could I do that?
Bijgom

• Hi Bijgom,

Thank you for the comment!

Unfortunately, n and freq cannot be appended directly as columns to the initial table, since the frequency table has a different number of rows than the initial table (i.e. 15 vs. 100).

However, you may store the results of the frequency table in a new table by adding data_freq <- in front of the R code:

```r
data_freq <- data %>%
  group_by(x, y) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
```
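If n and freq are nevertheless needed next to every original row, one possible workaround is to repeat them for each row by joining the frequency table back onto the data. This is only a sketch: it assumes dplyr 1.0.0 or later (for the `.groups` argument), and the name `data_joined` is just illustrative:

```r
library(dplyr)

set.seed(9876)                                       # Recreate the example data
data <- data.frame(x = sample(1:5, 100, replace = TRUE),
                   y = sample(LETTERS[1:3], 100, replace = TRUE))

# Frequency table as before, ungrouped at the end so that it
# behaves like a plain lookup table
data_freq <- data %>%
  group_by(x, y) %>%
  summarise(n = n(), .groups = "drop_last") %>%
  mutate(freq = n / sum(n)) %>%
  ungroup()

# Attach n and freq to each of the 100 original rows;
# rows with the same x and y get identical values
data_joined <- left_join(data, data_freq, by = c("x", "y"))
```

The joined table keeps all 100 rows, so n and freq are duplicated for rows that share the same combination of x and y. For plotting, the 15-row table `data_freq` is often the more convenient input.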

I hope that helps!

Joachim