Aggregate data.table by Group in R (2 Examples)

 

In this tutorial you’ll learn how to summarize a data.table by group in the R programming language.

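The article contains the following content blocks: the example data and software packages, two examples (sum by group and mean by group), and a summary with further resources.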

Let’s dive into it:

 

Example Data & Software Packages

To be able to use the functions of the data.table package, we first have to install and load data.table:

install.packages("data.table")                                 # Install data.table package
library("data.table")                                          # Load data.table

We create some example data:

data <- data.table(value = 1:6,                                # Create data.table
                   group = factor(letters[1:2]))
data                                                           # Print data.table

 

Table 1: Example data.table with the columns value and group.

 

Table 1 shows the RStudio console output of the previous syntax and the structure of our example data: it consists of six rows and two columns. The column value has the integer class, and the column group is a factor. Because the two factor levels a and b are recycled over the six rows, each group contains three values.
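If you want to verify this structure yourself, the following lines are a minimal sketch. They only assume the example data created above and use the base R functions dim(), sapply(), and levels():

```r
library(data.table)                                            # Load data.table

data <- data.table(value = 1:6,                                # Recreate example data
                   group = factor(letters[1:2]))

dim(data)                                                      # Number of rows and columns
sapply(data, class)                                            # Class of each column
levels(data$group)                                             # Levels of the factor column
```

The call to dim() returns the number of rows and columns, sapply() returns the class of each column, and levels() lists the categories that are used for grouping in the examples below.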

 

Example 1: Calculate Sum by Group in data.table

In this example, I’ll explain how to aggregate a data.table object, i.e. how to summarize its information by the entries of the column group.

data_sum <- data[ , .(group_sum = sum(value)), by = group]     # Aggregate data
data_sum                                                       # Print sum by group

 

Table 2: Sum of the column value by group.

 

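As shown in Table 2, the previous syntax has created a new data.table object with one row per group. Within .(), which is the data.table alias for list(), we declare that the group sums should be stored in a column called group_sum. The by argument specifies the grouping column.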
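As a quick plausibility check, the group sums can be compared with a base R calculation. The following is only a sketch under the assumption that the example data from above is used; the object name check_sum is chosen for illustration and is not part of the original code:

```r
library(data.table)                                            # Load data.table

data <- data.table(value = 1:6,                                # Recreate example data
                   group = factor(letters[1:2]))

data_sum <- data[ , .(group_sum = sum(value)), by = group]     # Aggregate data

check_sum <- tapply(data$value, data$group, sum)               # Base R sum per factor level

all.equal(data_sum$group_sum, as.vector(check_sum))            # Compare both results
```

Both approaches add up the values 1, 3, 5 for group a and 2, 4, 6 for group b, so the comparison should confirm that the results match.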

 

Example 2: Calculate Mean by Group in data.table

In Example 2, I’ll show how to calculate the group means in a data.table object for each category of the column group.

data_mean <- data[ , .(group_mean = mean(value)), by = group]  # Aggregate data
data_mean                                                      # Print mean by group

 

Table 3: Mean of the column value by group.

 

As shown in Table 3, the previous R code has constructed a data.table object in which the new column group_mean stores the mean of the column value for each category of the column group.
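The same syntax can also return several summary statistics at once. The following sketch is a small variation of the two examples and not part of the original code: the object name data_stats and the column name n are assumptions for illustration, and .N is the data.table symbol for the number of rows in each group.

```r
library(data.table)                                            # Load data.table

data <- data.table(value = 1:6,                                # Recreate example data
                   group = factor(letters[1:2]))

data_stats <- data[ , .(group_sum = sum(value),                # Sum by group
                        group_mean = mean(value),              # Mean by group
                        n = .N),                               # Number of rows by group
                    by = group]
data_stats                                                     # Print summary by group
```

Each entry within .() becomes one column of the aggregated data.table, so further statistics could be added in the same way.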

 

Video, Further Resources & Summary

Do you want to know more about the aggregation of a data.table by group? Then I recommend having a look at the following video on my YouTube channel. In the video, I show the content of this tutorial:

 

The YouTube video will be added soon.

 

Besides the video, you may want to have a look at the related articles on Statistics Globe.

 

In this tutorial you have learned how to aggregate a data.table by group in R. If you have any further questions, please let me know in the comments section.

 

Anna-Lena Wölwer Survey Statistician & R Programmer

This page was created in collaboration with Anna-Lena Wölwer. Have a look at Anna-Lena’s author page to get further information about her academic background and the other articles she has written for Statistics Globe.

 
