Create Lagged Variable by Group in R (Example)
In this R programming tutorial you’ll learn how to add a column with lagged values by group to a data frame.
You’re here for the answer, so let’s get straight to the example R code:
Introduction of Example Data
The first step is to create some data that we can use in the examples later on:
data <- data.frame(group = c(rep(LETTERS[1:3],    # Create example data
                                 each = 3), "C"),
                   values = 11:20)
data                                              # Print example data
Have a look at the table returned by the previous R syntax. It shows that our example data consists of ten rows and two columns.
The variable group defines the groups of our data (A, B, and C), and the variable values contains the corresponding values (the integers 11 to 20).
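As a quick check of the structure described above, the following sketch rebuilds the example data and counts the rows per group. The object name group_sizes is only used for illustration here and is not part of the original tutorial.

```r
data <- data.frame(group = c(rep(LETTERS[1:3],    # Rebuild example data
                                 each = 3), "C"),
                   values = 11:20)

dim(data)                                         # Rows and columns of data

group_sizes <- table(data$group)                  # Count rows per group
group_sizes                                       # A and B: 3 rows, C: 4 rows
```

The groups have different sizes, which is worth keeping in mind because each group will later start with its own missing value in the lagged column.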
Example: Create Lagged Variable by Group Using dplyr Package
First, we need to install and load the dplyr package:
install.packages("dplyr")                         # Install dplyr package
library("dplyr")                                  # Load dplyr
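Next, we can group the data by the group column and apply the lag function within mutate to add the lagged column:
data_dplyr <- data %>%                            # Add lagged column
  group_by(group) %>%
  dplyr::mutate(lag1 = dplyr::lag(values, n = 1, default = NA)) %>%
  as.data.frame()
data_dplyr                                        # Print updated data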
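Table 2 shows that we have created a new data frame with an additional variable called lag1. Within each group, lag1 contains the value of the previous row, and the first row of each group is NA.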
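To verify the result without relying on the printed table, the sketch below repeats the dplyr pipeline from this example and checks where the missing values fall. It assumes that dplyr is already installed. The object name first_rows is hypothetical and only used for this check.

```r
library("dplyr")                                  # Load dplyr

data <- data.frame(group = c(rep(LETTERS[1:3],    # Rebuild example data
                                 each = 3), "C"),
                   values = 11:20)

data_dplyr <- data %>%                            # Same pipeline as above
  group_by(group) %>%
  dplyr::mutate(lag1 = dplyr::lag(values, n = 1, default = NA)) %>%
  as.data.frame()

first_rows <- which(is.na(data_dplyr$lag1))       # Rows with missing lag
first_rows                                        # First row of each group

data_dplyr[data_dplyr$group == "C", ]             # Inspect a single group
```

The missing values should appear in rows 1, 4, and 7, which are the first rows of the groups A, B, and C. No value is carried over from one group into the next.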
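Please note that we have specified the name of the dplyr package in front of the mutate and lag functions, because functions with the same name also exist in other packages. For example, the stats package that ships with R contains its own lag function, and the plyr package contains its own mutate function.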
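The following sketch illustrates why the package prefix matters, using a small vector x that is not part of the original example. The lag function of the stats package only shifts the time attribute of its input, whereas the lag function of dplyr shifts the values themselves. The dplyr part runs only if the package is installed.

```r
x <- 11:15                                        # Small example vector

lag_stats <- stats::lag(x, k = 1)                 # lag function of stats
as.vector(lag_stats)                              # Values stay in place
attr(lag_stats, "tsp")                            # Only time attribute moves

if(requireNamespace("dplyr", quietly = TRUE)) {   # Only if dplyr is installed
  lag_dplyr <- dplyr::lag(x, n = 1)               # lag function of dplyr
  print(lag_dplyr)                                # NA 11 12 13 14
}
```

Calling lag without a prefix therefore depends on which package was loaded last, so the explicit dplyr:: prefix is the safer choice inside mutate.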
Also note that we have converted the output of the dplyr functions to the data.frame class by using the as.data.frame function. You may remove this line of code in case you prefer to return a tibble instead of a data frame.
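A minimal sketch of the difference between the two return types is shown below, assuming that dplyr is installed. The object names data_tbl and data_ungrouped are hypothetical, and the ungroup function is an addition that is not used in the original example.

```r
library("dplyr")                                  # Load dplyr

data <- data.frame(group = c(rep(LETTERS[1:3],    # Rebuild example data
                                 each = 3), "C"),
                   values = 11:20)

data_tbl <- data %>%                              # Without as.data.frame
  group_by(group) %>%
  dplyr::mutate(lag1 = dplyr::lag(values, n = 1, default = NA))
class(data_tbl)                                   # Grouped tibble

data_ungrouped <- dplyr::ungroup(data_tbl)        # Remove grouping only
class(data_ungrouped)                             # Plain tibble

class(as.data.frame(data_tbl))                    # Plain data frame
```

Without as.data.frame the result is still grouped, so later dplyr steps would continue to operate by group. If you prefer to keep the tibble, ungroup is one way to drop the grouping.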
Video & Further Resources
I have recently released a video on my YouTube channel, which shows the R code of this article. You can find the video below.
Besides that, you might have a look at the related tutorials on this website.
- lead & lag R Functions of dplyr Package
- Use Previous Row of data.table in R
- Convert Data Frame with Date Column to Time Series Object
- R Programming Examples
You have learned in this tutorial how to create a lagged version of a variable by group in the R programming language. This is a very common task when dealing with time series data. In case you have additional questions, let me know in the comments section.