Count Unique Values by Group in R (3 Examples)

 

In this post, I’ll show how to get the number of distinct values in each group of a data frame in the R programming language.

Here’s the step-by-step process:

 

Creation of Example Data

We’ll use the data below as a basis for this R tutorial:

data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),  # Create example data
                   group = c("A", "A", "A",
                             "B", "B",
                             "C", "C", "C", "C"))
data                                                  # Print example data
#   x group
# 1 1     A
# 2 1     A
# 3 2     A
# 4 2     B
# 5 2     B
# 6 3     C
# 7 3     C
# 8 4     C
# 9 5     C

As you can see based on the previous output of the RStudio console, our example data has nine rows and two columns. The variable x contains different values and the variable group shows the grouping labels of our data.
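
Before grouping, it can help to check how many distinct values x contains overall. The following is a minimal sketch that rebuilds the example data and counts them with the base R functions unique() and length(); the object name n_unique_total is only chosen for illustration:

```r
data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),  # Recreate example data
                   group = c("A", "A", "A",
                             "B", "B",
                             "C", "C", "C", "C"))
n_unique_total <- length(unique(data$x))              # Distinct values in x overall
n_unique_total                                        # Print overall count
# [1] 5
```

Note that the value 2 occurs in both group A and group B. For that reason, counts by group do not have to add up to the overall number of distinct values.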

The following examples show different ways to count unique values within each group of our data frame.

 

Example 1: Counting Unique Values by Group Using aggregate() Function of Base R

In Example 1, I’ll explain how to use the aggregate function that is contained in the basic installation of the R programming language to count unique values in each group of our data frame.

Consider the following R code:

data_count_1 <- aggregate(x ~ group,                  # Applying aggregate
                          data = data,
                          FUN = function(x) length(unique(x)))
data_count_1                                          # Print counts
#   group x
# 1     A 2
# 2     B 1
# 3     C 3

The output of the previous R programming code is a data frame containing one row for each group (i.e. A, B, and C).

The variable x in the previous output shows the number of unique values in each group (i.e. group A contains 2 unique values, group B contains 1 unique value, and group C contains 3 unique values).
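
If a named vector is sufficient instead of a data frame, the base R function tapply() may be used as an alternative. The code below is only a sketch of this idea and not part of the example above; the object name count_vec is chosen for illustration:

```r
data <- data.frame(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),  # Recreate example data
                   group = c("A", "A", "A",
                             "B", "B",
                             "C", "C", "C", "C"))
count_vec <- tapply(data$x,                           # Values to be counted
                    data$group,                       # Grouping labels
                    function(v) length(unique(v)))    # Count unique values per group
count_vec                                             # Print named counts
# A B C 
# 2 1 3 
```

The result contains the same counts as data_count_1, but they are stored as a named object that can be indexed by group label (e.g. count_vec["C"]).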

 

Example 2: Counting Unique Values by Group Using group_by() & summarise() Functions of dplyr Package

In this example, I’ll show how to use the dplyr package to count distinct values in each group.

We first have to install and load the dplyr package if we want to use the corresponding functions:

install.packages("dplyr")                             # Install dplyr package
library("dplyr")                                      # Load dplyr

Now, we can apply the group_by and summarise functions of the dplyr package to return a tibble showing the group counts of our data:

data_count_2 <- data %>%                              # Applying group_by & summarise
  group_by(group) %>%
  summarise(count = n_distinct(x))
data_count_2                                          # Print counts
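# # A tibble: 3 x 2
#   group count
#   <fct> <int>
# 1 A         2
# 2 B         1
# 3 C         3

Note that in R 4.0.0 and later, data.frame() no longer converts character strings to factors by default, so the group column is shown as <chr> instead of <fct>.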
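
By default, n_distinct() treats NA as a value of its own. The following sketch illustrates the difference that the na.rm argument makes; the data frame data_na and the column names count_all and count_no_na are hypothetical and only created for this illustration:

```r
library("dplyr")                                      # Load dplyr

data_na <- data.frame(x = c(1, NA, 2, 2, NA),         # Hypothetical data with NA values
                      group = c("A", "A", "A", "B", "B"))
data_count_na <- data_na %>%                          # Count with & without NA
  group_by(group) %>%
  summarise(count_all = n_distinct(x),                # NA counted as a distinct value
            count_no_na = n_distinct(x, na.rm = TRUE))  # NA ignored
data_count_na                                         # Print counts
```

In this sketch, group A has a count of 3 with NA included and 2 without, and group B has a count of 2 with NA included and 1 without.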

 

Example 3: Counting Unique Values by Group Using length() & unique() Functions & data.table Package

Another alternative for counting unique values by group is provided by the data.table package. First, we need to install and load the data.table package:

install.packages("data.table")                        # Install & load data.table
library("data.table")

Now, we can use the following R syntax to create a data table that contains our group counts:

data_table <- data.table(data)                        # Applying data.table, length & unique
data_count_3 <- data_table[ , .(count = length(unique(x))), by = group]
data_count_3                                          # Print counts
#    group count
# 1:     A     2
# 2:     B     1
# 3:     C     3
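
The data.table package also provides the uniqueN() function, which can replace the combination of length() and unique(). The code below is a minimal sketch of this variant; the object name data_count_4 is chosen for illustration and does not appear elsewhere in this tutorial:

```r
library("data.table")                                 # Load data.table

data_table <- data.table(x = c(1, 1, 2, 2, 2, 3, 3, 4, 5),  # Recreate example data
                         group = c("A", "A", "A",
                                   "B", "B",
                                   "C", "C", "C", "C"))
data_count_4 <- data_table[ , .(count = uniqueN(x)), by = group]  # Applying uniqueN
data_count_4                                          # Print counts
```

The resulting data table contains the same counts as data_count_3, i.e. 2 for group A, 1 for group B, and 3 for group C.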

 

Video & Further Resources

Do you need further explanations on the R programming syntax of this tutorial? Then you may watch the following video on my YouTube channel. In the video, I show the R code of this tutorial in a live session in RStudio.

 

The YouTube video will be added soon.

 

Besides the video, you may want to have a look at the related articles on my website, which cover topics such as dplyr, numeric values, groups, and missing data.

 

In summary: In this tutorial, you learned how to return the number of unique values by group in the R programming language. If you have any further questions, let me know in the comments.

 
