Count Unique Values by Group in R (3 Examples)
In this post, I’ll show how to get the number of distinct values in each group of a data frame in the R programming language.
Here’s the step-by-step process:
Creation of Example Data
We’ll use the data below as the basis for this R tutorial:
data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),    # Create example data
                   group = c("A", "A", "A", "B", "B", "C", "C", "C", "C"))
data                                                    # Print example data
#   x group
# 1 1     A
# 2 1     A
# 3 2     A
# 4 2     B
# 5 2     B
# 6 3     C
# 7 3     C
# 8 4     C
# 9 5     C
As you can see based on the previous output of the RStudio console, our example data has nine rows and two columns. The variable x contains different values and the variable group shows the grouping labels of our data.
The following examples show different ways to count the unique values within each group of our data frame.
Example 1: Counting Unique Values by Group Using aggregate() Function of Base R
In Example 1, I’ll explain how to use the aggregate function that is contained in the basic installation of the R programming language to count unique values in each group of our data frame.
Consider the following R code:
data_count_1 <- aggregate(x ~ group,                    # Applying aggregate
                          data = data,
                          FUN = function(x) length(unique(x)))
data_count_1                                            # Print counts
#   group x
# 1     A 2
# 2     B 1
# 3     C 3
The output of the previous R programming code is a data frame containing one row for each group (i.e. A, B, and C).
The variable x in the previous output shows the number of unique values in each group (i.e. group A contains 2 unique values, group B contains 1 unique value, and group C contains 3 unique values).
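To see where these counts come from, it can help to walk through one group by hand. The following is a minimal sketch using only base R; the helper object x_c is a name introduced here for illustration and is not part of the original code.

```r
# Recreate the example data of this tutorial
data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),
                   group = c("A", "A", "A", "B", "B", "C", "C", "C", "C"))

x_c <- data$x[data$group == "C"]   # Values of x in group C
x_c
# [1] 3 3 4 5

unique(x_c)                        # Duplicates removed
# [1] 3 4 5

length(unique(x_c))                # Number of unique values in group C
# [1] 3
```

The aggregate function applies the same length(unique(x)) step to each group separately.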
Example 2: Counting Unique Values by Group Using group_by() & summarise() Functions of dplyr Package
In this example, I’ll show how to use the dplyr package to count distinct values in each group.
To use the corresponding functions, we first have to install and load the dplyr package:
install.packages("dplyr")                               # Install dplyr package
library("dplyr")                                        # Load dplyr
Now, we can apply the group_by and summarise functions of the dplyr package to return a tibble showing the group counts of our data:
data_count_2 <- data %>%                                # Applying group_by & summarise
  group_by(group) %>%
  summarise(count = n_distinct(x))
data_count_2                                            # Print counts
# # A tibble: 3 x 2
#   group count
#   <fct> <int>
# 1 A         2
# 2 B         1
# 3 C         3
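If a column contains missing values, it can matter whether NA is counted as a distinct value of its own. The following is a minimal sketch, assuming a hypothetical variant of the example data (data_na, which is not part of the original tutorial) with one NA in group B. It relies on the na.rm argument of n_distinct.

```r
library("dplyr")

# Hypothetical variant of the example data with one missing value in group B
data_na <- data.frame(x = c(1, 1, 2, 2, NA, 3, 3, 4, 5),
                      group = c("A", "A", "A", "B", "B", "C", "C", "C", "C"))

data_count_na <- data_na %>%
  group_by(group) %>%
  summarise(count_with_na = n_distinct(x),                  # NA counts as a value
            count_without_na = n_distinct(x, na.rm = TRUE)) # NA is ignored
data_count_na                                               # Print counts
```

In this sketch, group B contains the values 2 and NA, so the two columns differ only for that group.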
Example 3: Counting Unique Values by Group Using length() & unique() Functions & data.table Package
Another alternative for counting unique values by group is provided by the data.table package. First, we need to install and load the data.table package:
install.packages("data.table")                          # Install & load data.table
library("data.table")
Now, we can use the following R syntax to create a data table that contains our group counts:
data_table <- data.table(data)                          # Applying data.table, length & unique
data_count_3 <- data_table[ , .(count = length(unique(x))), by = group]
data_count_3                                            # Print counts
#    group count
# 1:     A     2
# 2:     B     1
# 3:     C     3
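As a possible shortcut, the data.table package also provides the uniqueN function, which returns the number of unique values directly. The following is a minimal sketch under the assumption that a reasonably recent version of data.table is installed; the object name data_count_4 is introduced here for illustration only.

```r
library("data.table")

# Recreate the example data of this tutorial as a data.table
data_table <- data.table(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),
                         group = c("A", "A", "A", "B", "B", "C", "C", "C", "C"))

# uniqueN(x) is used here in place of length(unique(x))
data_count_4 <- data_table[ , .(count = uniqueN(x)), by = group]
data_count_4                                            # Print counts
```

For the example data, this should give the same group counts as the length and unique approach shown above.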
Video & Further Resources
Do you need further explanations on the R programming syntax of this tutorial? Then you may watch the following video on my YouTube channel. In the video, I show the R code of this tutorial in a live session in RStudio.
Besides the video, you may want to have a look at the related articles on my website. Please find a selection of related articles about topics such as dplyr, numeric values, groups, and missing data below:
- Count Number of Cases within Each Group of Data Frame
- Count Number of Rows by Group Using dplyr Package
- Count NA Values in R
- Count Unique Values in R
- The R Programming Language
In summary: In this tutorial, you learned how to return the number of unique values by group in the R programming language. If you have any further questions, let me know in the comments.