Count Observations by Factor Level in R (3 Examples)

In this tutorial, I’ll show how to return the count of each category of a factor in R programming.

The tutorial will contain the following content:

1) Example Data

2) Example 1: Get Frequency of Categories Using table() Function

3) Example 2: Get Frequency of Categories Using count() Function of dplyr Package

4) Example 3: Get Frequency of Categories Using data.table Package

5) Video & Further Resources

Let’s dive right into the programming part:

Example Data

As a first step, I have to create some example data:

vec <- as.factor(c("A", "B", "A",      # Create example factor
                   "C", "C", "A",
                   "B", "C", "D"))
vec <- vec[-length(vec)]               # Remove last element (level D is kept)
vec                                    # Print example factor
# [1] A B A C C A B C
# Levels: A B C D

The previous output of the RStudio console shows the structure of our example data: It’s a factor vector consisting of eight vector elements.

Note that our factor has four different factor levels (A, B, C, and D). The factor level D is empty, because we removed the only element with the value D after creating the factor.
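The difference between the levels of a factor and the values it currently holds can be checked with the base R functions levels(), nlevels(), and droplevels(). The following lines are a minimal sketch that recreates the example factor so it runs on its own; the object name vec_dropped is introduced here only for illustration:

```r
vec <- as.factor(c("A", "B", "A",      # Recreate example factor
                   "C", "C", "A",
                   "B", "C", "D"))
vec <- vec[-length(vec)]               # Remove last element (level D is kept)

levels(vec)                            # All levels, including the empty one
# [1] "A" "B" "C" "D"
nlevels(vec)                           # Number of levels
# [1] 4
length(vec)                            # Number of elements
# [1] 8

vec_dropped <- droplevels(vec)         # Copy of the factor without unused levels
levels(vec_dropped)
# [1] "A" "B" "C"
```

If empty levels should not appear in a frequency table at all, applying droplevels() first is one possible way to get rid of them.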

Let’s count the occurrences of each of the categories of our factor.

Example 1: Get Frequency of Categories Using table() Function

In this example, I’ll explain how to count the number of values per level in a given factor using the table function provided by the basic installation of the R programming language.

Have a look at the following R code and its output:

table(vec)                             # Applying table function
# vec
# A B C D 
# 3 2 3 0

As you can see, the output is a frequency table. The header of this table identifies the four different factor levels of our categorical variable (i.e. A, B, C, and D). The row below the header shows how often each of these levels appears in our data (i.e. A occurs three times, B occurs two times, C occurs three times, and D occurs zero times).
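In case the counts are needed for further processing, single values can be extracted from the table by level name, and the whole table can be converted to a data frame. This is only a sketch; the object names tab and freq_df are assumptions made for this illustration:

```r
vec <- factor(c("A", "B", "A", "C",    # Recreate example factor
                "C", "A", "B", "C"),
              levels = c("A", "B", "C", "D"))

tab <- table(vec)                      # Store frequency table

tab[["A"]]                             # Count of a single level
# [1] 3
tab[["D"]]                             # Empty level has count zero
# [1] 0

freq_df <- as.data.frame(tab)          # Convert table to data frame
freq_df
#   vec Freq
# 1   A    3
# 2   B    2
# 3   C    3
# 4   D    0
```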

Looks good! However, the R programming language also provides many add-on packages that can produce frequency tables. In the following examples, I’ll explain two of those packages. So keep on reading!

Example 2: Get Frequency of Categories Using count() Function of dplyr Package

In this example, I’ll show how to use the dplyr package to count the number of observations by factor levels.

If we want to use the functions of the dplyr package, we first have to install and load dplyr:

install.packages("dplyr")              # Install dplyr package
library("dplyr")                       # Load dplyr

Furthermore, we have to convert our factor vector to a data.frame:

data_vec <- data.frame(vec)            # Create data frame
data_vec                               # Print data frame
#   vec
# 1   A
# 2   B
# 3   A
# 4   C
# 5   C
# 6   A
# 7   B
# 8   C

Now, we can apply the count function of the dplyr package to create a frequency table:

dplyr::count(data_vec, vec)            # Applying count function
#   vec       n
#   <fct> <int>
# 1 A         3
# 2 B         2
# 3 C         3

Note that, by default, the previous table doesn’t show empty categories, i.e. the empty factor level D is not shown.
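If the empty level should be part of the result, the count function offers a .drop argument. The following lines are a minimal sketch, assuming a dplyr version that supports this argument; the object name count_all is chosen only for this illustration:

```r
library("dplyr")                       # Load dplyr

vec <- factor(c("A", "B", "A", "C",    # Recreate example factor
                "C", "A", "B", "C"),
              levels = c("A", "B", "C", "D"))
data_vec <- data.frame(vec)            # Create data frame

count_all <- dplyr::count(data_vec,    # Keep empty factor levels
                          vec,
                          .drop = FALSE)
count_all                              # Four rows: A = 3, B = 2, C = 3, D = 0
```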

Example 3: Get Frequency of Categories Using data.table Package

This example explains how to use the data.table package to count the number of cases in each category.

To use the functions of the data.table package, we first need to install and load it:

install.packages("data.table")         # Install & load data.table package
library("data.table")

Now, we can use the following R code to return a table with frequency counts:

setDT(data_vec)[ , .N, keyby = vec]    # Using data.table package
#    vec N
# 1:   A 3
# 2:   B 2
# 3:   C 3

Note that the data.table package also doesn’t return the count of empty categories when used this way.
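One possible way to include empty categories with data.table is to join the data against a table of all factor levels and count the matches for each row of that table using by = .EACHI. The code below is only a sketch of this idea; the object names dt_vec, all_levels, and dt_all are assumptions made for this illustration:

```r
library("data.table")                  # Load data.table

vec <- factor(c("A", "B", "A", "C",    # Recreate example factor
                "C", "A", "B", "C"),
              levels = c("A", "B", "C", "D"))
dt_vec <- data.table(vec)              # Create data.table

all_levels <- data.table(              # One row per factor level
  vec = factor(levels(vec), levels = levels(vec)))

dt_all <- dt_vec[all_levels,           # Count matches per level
                 on = "vec",
                 .N,
                 by = .EACHI]
dt_all                                 # Four rows: A = 3, B = 2, C = 3, D = 0
```

Levels without any matching rows get a count of zero, because .N refers to the number of matched rows in each join group.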

Video & Further Resources

Do you need further information on the content of this tutorial? Then I recommend watching the following video on my YouTube channel. In the video tutorial, I explain the content of this tutorial in R.

Furthermore, you could read the other articles on this website. I have released numerous articles already:

R Programming Language

To summarize: At this point you should know how to get the frequency counts of factor vectors and variables in the R programming language. If you have further questions, don’t hesitate to let me know in the comments section below.

2 Comments

Danly
August 27, 2022 11:04 pm

Thanks, this was helpful. Now I’m wondering how I can use these counts in a graph.

For example, let’s say I have a healthcare data set on patients with diabetes, with variables:
-DM (History of Diabetes, Factor, 0/1)
-Sex(M/F, Factor, 0/1),
-Vital_Status (Alive/Dead, Factor,0/1)

And I would like to create a bar graph with:
x=DM y/n
y=Count (n) of alive/dead
color=gender

- Joachim
  August 29, 2022 9:45 am
  Hey Danly,
  
  Thanks for the kind comment!
  
  Could you illustrate the structure of your data set in some more detail? What is returned when you execute the following R code:
  head(your_data)
  Regards,
  Joachim