Calculate Percentage by Group in R (2 Examples)
In this article, I’ll demonstrate how to get the percentage by group in R programming.
Here’s the step-by-step process…
Creating Example Data
Have a look at the following example data:
data <- data.frame(group = rep(LETTERS[1:3], each = 4),    # Create example data
                   subgroup = letters[1:4],
                   value = 1:12)
data                                                       # Print example data
As you can see based on Table 1, our example data is a data frame containing twelve rows and three columns called “group”, “subgroup”, and “value”.
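Before calculating percentages, it can help to look at the totals that each value will be divided by. The following is a minimal sketch using base R only; the object name group_totals is my own choice and is not used elsewhere in this tutorial.

```r
# Recreate the example data from above
data <- data.frame(group = rep(LETTERS[1:3], each = 4),
                   subgroup = letters[1:4],
                   value = 1:12)

str(data)                                   # Inspect dimensions and column types

# Sum of value within each group; these sums are the denominators
# of the percentages calculated in the examples
group_totals <- aggregate(value ~ group, data = data, FUN = sum)
group_totals                                # Totals: A = 10, B = 26, C = 42
```

Each group contains four consecutive values, so the totals are 1 + 2 + 3 + 4 = 10, 5 + 6 + 7 + 8 = 26, and 9 + 10 + 11 + 12 = 42.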
Example 1: Calculate Percentage by Group Using transform() Function
Have a look at the following R code:
data_new1 <- transform(data,                               # Calculate percentage by group
                       perc = ave(value, group, FUN = prop.table))
data_new1                                                  # Print updated data
As shown in Table 2, we have created a new data frame with a new column called perc. This column contains each value's share of its group total, expressed as a proportion between 0 and 1 (multiply by 100 to obtain percent).
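To make the calculation more transparent, the sketch below reproduces the result of ave() and prop.table() by dividing each value by its group sum, checks that the shares add up to 1 within every group, and adds a column on the 0 to 100 scale. The names manual, group_sums, and perc_100 are illustrative and not part of the original example.

```r
# Recreate the example data and the result of Example 1
data <- data.frame(group = rep(LETTERS[1:3], each = 4),
                   subgroup = letters[1:4],
                   value = 1:12)
data_new1 <- transform(data,
                       perc = ave(value, group, FUN = prop.table))

# prop.table(x) is x / sum(x), so this manual version should match
manual <- data$value / ave(data$value, data$group, FUN = sum)
all.equal(data_new1$perc, manual)           # Compare both calculations

# The shares within each group should sum to 1
group_sums <- tapply(data_new1$perc, data_new1$group, sum)
group_sums

# Optional: express the shares in percent, rounded to one decimal place
data_new1$perc_100 <- round(100 * data_new1$perc, 1)
head(data_new1, 4)                          # Print rows of group A
```

For group A, for instance, the group sum is 10, so the values 1 to 4 correspond to shares of 0.1, 0.2, 0.3, and 0.4.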
Example 2: Calculate Percentage by Group Using group_by() & mutate() Functions of dplyr Package
As an alternative to base R (shown in Example 1), we can use the functions of the dplyr package to calculate the percentages for each group.
To be able to use the functions of the dplyr package, we first need to install and load dplyr:
install.packages("dplyr")                                  # Install dplyr package
library("dplyr")                                           # Load dplyr
Next, we can apply the group_by, mutate, and sum functions to create a new data frame variable containing the percentages by group:
data_new2 <- data %>%                                      # Calculate percentage by group
  group_by(group) %>%
  mutate(perc = value / sum(value)) %>%
  as.data.frame()
data_new2                                                  # Print updated data
The previous R code has created the same output as in Example 1. However, this time we have used the functions of the dplyr package.
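If you want to confirm that both approaches agree, a comparison along the following lines should work. This is only a sketch: the requireNamespace() guard and the object name base_perc are my additions, so that the code also runs on machines where dplyr is not installed.

```r
# Recreate the example data
data <- data.frame(group = rep(LETTERS[1:3], each = 4),
                   subgroup = letters[1:4],
                   value = 1:12)

# Base R result as in Example 1
base_perc <- ave(data$value, data$group, FUN = prop.table)

# Run the dplyr version only if the package is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  library("dplyr")

  data_new2 <- data %>%
    group_by(group) %>%                     # Define groups
    mutate(perc = value / sum(value)) %>%   # Share of each value in its group
    as.data.frame()                         # Convert grouped tibble to data frame

  print(all.equal(data_new2$perc, base_perc))  # Compare with base R result
}
```

The as.data.frame() call at the end of the pipe drops the grouping, so later operations on data_new2 are not silently performed by group.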
Video, Further Resources & Summary
I have recently released a video on my YouTube channel, which illustrates the R programming codes of this article. You can find the video below.
In addition, you may want to have a look at the other articles on this website.
- Select Top N Highest Values by Group
- Count Unique Values by Group in R
- Compute Summary Statistics by Group
- Count Number of Rows by Group Using dplyr Package
- R Programming Language
In summary: In this tutorial, you have learned how to calculate the percentage by group in R. If you have additional questions, let me know in the comments section.