Summarize Multiple Columns of data.table by Group in R (Example)
In this R tutorial you’ll learn how to summarize multiple columns of a data.table by group.
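The tutorial will contain these contents:
- Example Data & Packages
- Example: Group Multiple Variables Using data.table Package
- Video, Further Resources & Summary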
Here’s how to do it…
Example Data & Packages
At the start, we’ll have to construct some example data with the data.table class.
To be able to use the functions of the data.table package, we first have to install and load data.table:
install.packages("data.table")                        # Install data.table package
library("data.table")                                 # Load data.table
Next, we can create a data.table using the data.table() function as shown below:
data <- data.table(x1 = 1:12,                         # Create example data.table
                   x2 = 11:22,
                   group = rep(letters[1:3], each = 4))
data                                                  # Print example data.table
Table 1 shows that the example data.table contains twelve rows and three columns: the two integer variables x1 and x2, and the grouping variable group with the values a, b, and c.
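If you want to confirm this structure in the console, a minimal sketch such as the following may help. It rebuilds the same example data and inspects it with the base R functions dim() and sapply(); this check is an illustration only and not part of the original tutorial code.

```r
library("data.table")

# Rebuild the example data.table from above
data <- data.table(x1 = 1:12,
                   x2 = 11:22,
                   group = rep(letters[1:3], each = 4))

dim(data)            # Number of rows and columns: 12 3
sapply(data, class)  # Class of each column: integer, integer, character
```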
Example: Group Multiple Variables Using data.table Package
In this example, I’ll explain how to summarize multiple columns of a data.table by group to create descriptive statistics of our data.
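For this, we have to use lapply and .SD as shown below. Within each group, .SD contains all columns except the grouping column, and lapply applies the mean function to each of these columns.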
data_group <- data[, lapply(.SD, mean), by = group]   # Summarize by group
data_group                                            # Print summarized data.table
As shown in Table 2, the previous code has created a data.table showing the mean of each variable within each group. For example, group a has a mean of 2.5 for x1 and 12.5 for x2.
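The same pattern can be adapted to a subset of columns or to a different summary function. The following is a minimal sketch, assuming you only want some of the columns or a sum instead of a mean. It uses the .SDcols argument of data.table; the object names data_mean_x1 and data_sum are hypothetical and chosen only for this illustration.

```r
library("data.table")

# Rebuild the example data.table from above
data <- data.table(x1 = 1:12,
                   x2 = 11:22,
                   group = rep(letters[1:3], each = 4))

# Restrict .SD to the column x1 via .SDcols, then calculate the mean by group
data_mean_x1 <- data[, lapply(.SD, mean), by = group, .SDcols = "x1"]
data_mean_x1         # One row per group with the mean of x1

# Use another summary function (here: sum) for both value columns
data_sum <- data[, lapply(.SD, sum), by = group, .SDcols = c("x1", "x2")]
data_sum             # One row per group with the sums of x1 and x2
```

Naming the columns in .SDcols keeps the call explicit, which may be useful when the data.table contains non-numeric columns that should not be passed to the summary function.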
Video, Further Resources & Summary
Would you like to know more about data.tables in R? Then you may want to have a look at the following video on my YouTube channel, in which I show the examples of this tutorial.
Furthermore, you may have a look at the other tutorials on my website. You can find some other tutorials about topics such as variables and dplyr below.
- Sort Data Frame by Multiple Columns in R
- Split Data Frame Variable into Multiple Columns
- Drop Multiple Columns from Data Frame Using dplyr Package
- R Programming Language
To summarize: This page has illustrated how to summarize the variables of a data.table by groups in the R programming language. In case you have any additional questions, let me know in the comments section below.