# Proportions with dplyr Package in R (Example) | Create Relative Frequency Table

In this R tutorial you’ll learn how to **compute relative frequencies / proportions** with the dplyr package.

The article consists of the following contents:

- Creating Example Data
- Example: Get Relative Frequencies of Data Frame in R
- Video & Further Resources

Let’s dig in…

## Creating Example Data

The example data that we’ll use in this tutorial looks as follows:

```r
set.seed(9876)                                   # Create random example data
data <- data.frame(x = sample(1:5, 100, replace = TRUE),
                   y = sample(LETTERS[1:3], 100, replace = TRUE))
head(data)                                       # Print 6 rows of example data
#   x y
# 1 5 C
# 2 2 A
# 3 1 C
# 4 3 B
# 5 3 B
# 6 2 C
```

Our example data frame consists of 100 rows and two columns. The variable x contains the values 1, 2, 3, 4, and 5; and the variable y consists of the values A, B, and C.
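The structure described above can be checked with a few base R calls. This is only a quick sketch that recreates the example data and inspects its dimensions and the counts of each value:

```r
set.seed(9876)                                   # Recreate example data
data <- data.frame(x = sample(1:5, 100, replace = TRUE),
                   y = sample(LETTERS[1:3], 100, replace = TRUE))

dim(data)                                        # Number of rows and columns
table(data$x)                                    # Absolute counts of x values
table(data$y)                                    # Absolute counts of y values
```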

Furthermore, we have to install and load the dplyr package:

```r
install.packages("dplyr")                        # Install and load dplyr
library("dplyr")
```
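The installation only has to be done once per R installation. As an optional variation (not part of the original example), the following sketch installs dplyr only when it is not yet available and then loads it:

```r
if (!requireNamespace("dplyr", quietly = TRUE)) { # Check whether dplyr is installed
  install.packages("dplyr")                      # Install only if missing
}
library("dplyr")                                 # Load dplyr
```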

In the following example, we’ll create a table, representing the relative frequencies / proportions of our example data.

Keep on reading!

## Example: Get Relative Frequencies of Data Frame in R

To create a relative frequency table with the dplyr package, we can combine the group_by, summarise, n, and mutate functions of dplyr with the base R function sum. Have a look at the following R syntax:

```r
data %>%                                         # Create tibble with frequencies
  group_by(x, y) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
# # A tibble: 15 x 4
# # Groups:   x [5]
#        x y         n  freq
#    <int> <fct> <int> <dbl>
#  1     1 A         6 0.316
#  2     1 B         6 0.316
#  3     1 C         7 0.368
#  4     2 A         4 0.222
#  5     2 B         5 0.278
#  6     2 C         9 0.5
#  7     3 A         8 0.296
#  8     3 B        10 0.370
#  9     3 C         9 0.333
# 10     4 A         6 0.333
# 11     4 B         9 0.5
# 12     4 C         3 0.167
# 13     5 A         5 0.278
# 14     5 B         8 0.444
# 15     5 C         5 0.278
```

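As you can see based on the output of the RStudio console, the previous R code returned a tibble containing each combination of our two variables x and y, the count of each combination (n), and its relative frequency (freq). The tibble is still grouped by x (see the line Groups: x [5]), because summarise removes only the last grouping variable. The column freq is therefore the proportion of each y value within its x group: for x = 1, the counts 6, 6, and 7 sum to 19, and 6 / 19 ≈ 0.316. Note that the previous R code is based on this thread on Stack Overflow.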
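To contrast proportions within each value of x with proportions relative to all 100 rows, the grouping can be controlled explicitly. The following is a minimal sketch, assuming dplyr 1.0.0 or later (which provides the .groups argument of summarise). The object names prop_within and prop_overall are made up for this illustration:

```r
library("dplyr")

set.seed(9876)                                   # Recreate example data
data <- data.frame(x = sample(1:5, 100, replace = TRUE),
                   y = sample(LETTERS[1:3], 100, replace = TRUE))

prop_within <- data %>%                          # Proportions within each x
  group_by(x, y) %>%
  summarise(n = n(), .groups = "drop_last") %>%  # Keep grouping by x
  mutate(freq = n / sum(n))

prop_overall <- data %>%                         # Proportions of all rows
  group_by(x, y) %>%
  summarise(n = n(), .groups = "drop") %>%       # Remove all grouping
  mutate(freq = n / sum(n))

prop_within %>%                                  # One total per value of x
  summarise(total = sum(freq))

sum(prop_overall$freq)                           # Total across whole table
```

With "drop_last", the proportions add up to 1 within each value of x, which matches the default behavior shown in the example. With "drop", they add up to 1 across the entire table.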

## Video & Further Resources

I have recently published a video on my YouTube channel, which explains the contents of this post. You can find the video below:

*The YouTube video will be added soon.*

In addition, you might want to read the other tutorials on statisticsglobe.com:

- mutate & transmute R Functions of dplyr Package
- sum Function in R
- Introduction to dplyr
- The R Programming Language

In summary: In this article, I illustrated how to **summarize categorical variables in a frequency / proportion table** with the dplyr package in R programming. If you have additional comments or questions, please let me know in the comments section. Besides that, don’t forget to subscribe to my email newsletter in order to get updates on new articles.
