Count Unique Values by Group in R (3 Examples)

In this post, I’ll show how to get the number of distinct values in each group of a data frame in the R programming language.

The article contains these topics:

1) Creation of Example Data

2) Example 1: Counting Unique Values by Group Using aggregate() Function of Base R

3) Example 2: Counting Unique Values by Group Using group_by() & summarise Functions of dplyr Package

4) Example 3: Counting Unique Values by Group Using length() & unique() Functions & data.table Package

5) Video & Further Resources

Here’s the step-by-step process:

Creation of Example Data

We’ll use the data below as the basis for this R tutorial:

data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),  # Create example data
                   group = c("A", "A", "A",
                             "B", "B",
                             "C", "C", "C", "C"))
data                                                  # Print example data
#   x group
# 1 1     A
# 2 1     A
# 3 2     A
# 4 2     B
# 5 2     B
# 6 3     C
# 7 3     C
# 8 4     C
# 9 5     C

As you can see based on the previous output of the RStudio console, our example data has nine rows and two columns. The variable x contains different values and the variable group shows the grouping labels of our data.

The following examples show different ways to count the unique values within each group of our data frame.

Example 1: Counting Unique Values by Group Using aggregate() Function of Base R

In Example 1, I’ll explain how to use the aggregate function that is contained in the basic installation of the R programming language to count unique values in each group of our data frame.

Consider the following R code:

data_count_1 <- aggregate(data = data,                # Applying aggregate
                          x ~ group,
                          function(x) length(unique(x)))
data_count_1                                          # Print counts
#   group x
# 1     A 2
# 2     B 1
# 3     C 3

The output of the previous R programming code is a data frame containing one row for each group (i.e. A, B, and C).

The variable x in the previous output shows the number of unique values in each group (i.e. group A contains 2 unique values, group B contains 1 unique value, and group C contains 3 unique values).
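
Because the column and the argument of the anonymous function are both called x in the code above, the call can be a little hard to read. The following sketch repeats the same aggregation with a differently named function argument (values is just an illustrative name, not part of the original code) and also shows tapply() as another Base R option:

```r
data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),  # Same example data as above
                   group = c("A", "A", "A",
                             "B", "B",
                             "C", "C", "C", "C"))

data_count_1b <- aggregate(x ~ group,                 # Formula first, data second
                           data = data,
                           FUN = function(values) length(unique(values)))

counts_vec <- tapply(data$x,                          # Values to summarise
                     data$group,                      # Grouping labels
                     function(values) length(unique(values)))
```

Here, data_count_1b should be a data frame like data_count_1, while counts_vec is a one-dimensional array whose names are the group labels.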

Example 2: Counting Unique Values by Group Using group_by() & summarise Functions of dplyr Package

In this example, I’ll show how to use the dplyr package to count distinct values in each group.

We first have to install and load the dplyr package to use the corresponding functions:

install.packages("dplyr")                             # Install dplyr package
library("dplyr")                                      # Load dplyr

Now, we can apply the group_by and summarise functions of the dplyr package to return a tibble showing the group counts of our data:

data_count_2 <- data %>%                              # Applying group_by & summarise
  group_by(group) %>%
  summarise(count = n_distinct(x))
data_count_2                                          # Print counts
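# # A tibble: 3 x 2
#   group count
#   <fct> <int>
# 1 A         2
# 2 B         1
# 3 C         3

Note that in R 4.0.0 and later, data.frame() keeps character columns as strings by default, so the group column may be printed as <chr> instead of <fct>.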
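
One detail worth knowing is how n_distinct treats missing values. The sketch below uses a hypothetical variant of the example data (data_na, with one NA added to group A, which is not part of the original tutorial) to compare the default behavior with the na.rm argument:

```r
library("dplyr")                                      # Load dplyr

data_na <- data.frame(x = c(1, 1, NA, 2, 2, 3, 3, 4, 5),  # Hypothetical data with one NA
                      group = c("A", "A", "A",
                                "B", "B",
                                "C", "C", "C", "C"))

count_with_na <- data_na %>%                          # NA is counted as its own value
  group_by(group) %>%
  summarise(count = n_distinct(x))

count_without_na <- data_na %>%                       # NA is ignored
  group_by(group) %>%
  summarise(count = n_distinct(x, na.rm = TRUE))
```

With this data, group A should get a count of 2 in count_with_na (the values 1 and NA) and a count of 1 in count_without_na.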

Example 3: Counting Unique Values by Group Using length() & unique() Functions & data.table Package

Another alternative for counting unique values by group is provided by the data.table package. First, we need to install and load the data.table package:

install.packages("data.table")                        # Install & load data.table
library("data.table")

Now, we can use the following R syntax to create a data table that contains our group counts:

data_table <- data.table(data)                        # Applying data.table, length & unique
data_count_3 <- data_table[ , .(count = length(unique(x))), by = group]
data_count_3                                          # Print counts
#    group count
# 1:     A     2
# 2:     B     1
# 3:     C     3
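
As a possible shortcut, the data.table package also provides the uniqueN function, which returns the number of unique values directly. The following is a minimal sketch of the same count using uniqueN instead of length and unique (the object name data_count_3b is only illustrative):

```r
library("data.table")                                 # Load data.table

data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),  # Same example data as above
                   group = c("A", "A", "A",
                             "B", "B",
                             "C", "C", "C", "C"))

data_table <- data.table(data)                        # Convert to data.table
data_count_3b <- data_table[ , .(count = uniqueN(x)), by = group]
```

The result should contain the same counts as data_count_3.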

Video & Further Resources

In summary: In this tutorial, you learned how to return the number of unique values by group in the R programming language. If you have any further questions, let me know in the comments.

2 Comments

Jeff Norriss
June 29, 2023 8:04 am

Thanks Joachim.
example 1 is a little confusing because you have a column named x and a function named x.
At least that’s my understanding.
Cheers
jeff

- Cansu (Statistics Globe)
  July 3, 2023 9:37 am
  
  Hello Jeff,
  
  You are right. A different column name could be used. Thanks for the feedback.
  
  Best,
  Cansu
  