# Use lapply Function for data.table in R (4 Examples)

In this post, you'll learn how to **apply a function to multiple columns of a data.table** in R programming.

Let's dive into it…

## Example Data & Software Packages

To be able to use the functions of the data.table package, we first need to install and load data.table:

install.packages("data.table")   # Install data.table package
library("data.table")            # Load data.table

For more information on data.table, see our blog post here and the introductory post here.

The following data will be used as a basis for this R tutorial:

set.seed(9)
data1 <- data.table(
  V1 = rnorm(500),                                    # Generate data
  V2 = rnorm(500),
  V3 = rnorm(500),
  V4 = sample(LETTERS[1:10], 50, replace = TRUE),
  V5 = sample(month.abb[1:4], 125, replace = TRUE)
)

head(data1) # Print head of data

As you can see based on Table 1, our example data is a data table containing five columns.

To generate the data, we used several functions. For more information, see our posts on *set.seed()* here, *rnorm()* here, *sample()* here, *LETTERS[]* here, and *month.abb[]* here.

## Example 1: Mean Values of Multiple Variables

The following R code illustrates how to apply the mean function to multiple columns of a data.table.

data1[ , lapply (.SD, mean), .SDcols = c("V1", "V2")] # Calculate mean values

The output of the previous R syntax is shown in Table 2: the mean values of columns *V1* and *V2*. *.SD* (short for "Subset of Data") serves as a placeholder for the columns to which the function is applied, and *.SDcols* restricts it to the columns listed.
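To make the role of *.SD* more concrete, here is a minimal sketch on a small hypothetical table (`dt_small` and its columns are made up for illustration and are not part of the example data). It computes the column means with lapply and, for comparison, directly from the columns.

```r
library(data.table)

# Small hypothetical table, used only for this illustration
dt_small <- data.table(a = c(1, 2, 3),
                       b = c(10, 20, 30),
                       g = c("x", "y", "x"))

# .SD contains only the columns named in .SDcols (here: a and b)
res_means <- dt_small[ , lapply(.SD, mean), .SDcols = c("a", "b")]

# The same values, computed directly from the columns
manual_means <- c(a = mean(dt_small$a), b = mean(dt_small$b))

res_means
manual_means
```

The lapply result is itself a data.table with one row and one column per entry in *.SDcols*, whereas the manual version is a plain named vector.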

## Example 2: Mean Values of Multiple Variables by Groups

In this example, I'll illustrate how to calculate the mean values of multiple columns separately for each group of *V5*, a grouping variable with four values.

data1[ , lapply (.SD, mean), by = .(V5), .SDcols = c("V1", "V2")] # Calculate group means

By executing the previously shown syntax, we have created Table 3. It shows the mean values of *V1* and *V2* per category of *V5*.
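The grouped version can be checked by hand on a small table. The following sketch again uses a hypothetical table (`dt_small` is an assumption made for illustration) and compares one group mean from the lapply call with the mean of the corresponding subset.

```r
library(data.table)

# Small hypothetical table with a two-level grouping column g
dt_small <- data.table(a = c(1, 2, 3, 4),
                       b = c(10, 20, 30, 40),
                       g = c("x", "y", "x", "y"))

# One row per group, one mean per column in .SDcols
res_by <- dt_small[ , lapply(.SD, mean), by = .(g), .SDcols = c("a", "b")]

# Mean of a within group "x", computed by subsetting the rows first
mean_a_x <- dt_small[g == "x", mean(a)]

res_by
mean_a_x
```

Within each group, *.SD* holds only the rows of that group, so lapply is evaluated once per group.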

## Example 3: lapply with Self-Defined Function

In this section, I'll demonstrate how to use a self-defined function with lapply. You can either define the function beforehand and pass it to lapply by name, just like the mean function above, or, as shown below, define it directly within lapply. The function used here returns the third element of each column multiplied by 5.

data1[ , lapply (.SD, function (x) { x[3] * 5 }), .SDcols = c("V1", "V2")] # Apply self-defined function

Table 4 shows the output of the previous command.
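For comparison, here is a sketch of the first variant, where the function is defined before the lapply call. The helper name `third_times_five` and the table `dt_small` are hypothetical and chosen only for this illustration.

```r
library(data.table)

# Small hypothetical table, used only for this illustration
dt_small <- data.table(a = c(1, 2, 3),
                       b = c(10, 20, 30))

# Hypothetical helper: third element of a vector, multiplied by 5
third_times_five <- function(x) {
  x[3] * 5
}

# Pass the function by name ...
res_named <- dt_small[ , lapply(.SD, third_times_five), .SDcols = c("a", "b")]

# ... or define it inline; both calls give the same result
res_inline <- dt_small[ , lapply(.SD, function(x) { x[3] * 5 }), .SDcols = c("a", "b")]

res_named
```

Defining the function beforehand is usually the more readable choice when the same function is reused in several calls.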

## Example 4: Define New Variables with lapply

As a last example, we show how to add the new data columns *V1_new* and *V2_new*, using lapply together with the *:=* operator.

data1 <- data1[ , c("V1_new", "V2_new") := lapply (.SD, function (x) { x^2 }), .SDcols = c("V1", "V2")] # Define new variables
head(data1)

The output of the previous R code is shown in Table 5: the two new columns are appended as the last columns. Because *:=* modifies a data.table by reference, the reassignment to *data1* is optional.
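When many columns are involved, the new column names can also be built programmatically. The following sketch assumes a small hypothetical table (`dt_small`, `cols`, and `new_cols` are names introduced only for this illustration) and adds the squared columns in place, without reassigning the result.

```r
library(data.table)

# Small hypothetical table, used only for this illustration
dt_small <- data.table(a = c(1, 2, 3),
                       b = c(10, 20, 30))

cols     <- c("a", "b")
new_cols <- paste0(cols, "_new")   # "a_new", "b_new"

# := adds the columns by reference; wrapping new_cols in parentheses
# tells data.table to use the names stored in the variable
dt_small[ , (new_cols) := lapply(.SD, function(x) { x^2 }), .SDcols = cols]

names(dt_small)
```

After the call, `dt_small` itself contains the two additional columns.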

## Video & Further Resources

I have recently published a video on the Statistics Globe YouTube channel, which explains the R codes of this post. You can find the video below.

*The YouTube video will be added soon.*

Furthermore, you may want to have a look at the related articles on my homepage. You can find some articles below.

- Convert data.table to Data Frame & Matrix in R (4 Examples)
- Compare Columns of data.table in R (5 Examples)
- Create Empty data.table with Column Names in R (2 Examples)
- Convert List to data.table in R (2 Examples)
- List of R Commands (Examples)
- All R Programming Tutorials

In this article, you have learned how to **handle lapply with data.table** in R. In case you have any additional questions, let me know in the comments section.

This page was created in collaboration with Anna-Lena Wölwer. Have a look at Anna-Lena's author page to get further information about her academic background and the other articles she has written for Statistics Globe.