diff Function in R (2 Examples) | How to Calculate the Difference in R
In this article, I’ll explain how to calculate differences of a vector with the diff function in R. Let’s first have a look at the basic R syntax and the definition of diff:
Basic R Syntax of diff():
diff(x)
Definition of diff():
The diff function computes the difference between pairs of consecutive elements of a numeric vector.
In the following, I’ll show you two examples for the application of diff in the R programming language. So without further ado, let’s move on to the examples.
Example 1: diff Function With Default Specifications
The diff function is usually applied to a numeric vector, array, or column of a data frame. So let’s create such a vector first:
x <- c(5, 2, 10, 1, 3) # Create example vector
Our example vector contains five values between 1 and 10. Now let’s use the diff command to compute the difference between each pair of consecutive values of this vector:
diff(x) # Apply diff in R
# -3 8 -9 2
So what happened here? The diff function did four separate calculations:
- 2 - 5 = -3
- 10 - 2 = 8
- 1 - 10 = -9
- 3 - 1 = 2
The R diff function subtracted the first value from the second, the second value from the third, the third value from the fourth, and the fourth value from the fifth. In other words: diff returned the differences at lag 1 to the RStudio console. Note that the output has one element fewer than the input vector.
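To make the four calculations above concrete, here is a minimal sketch that reproduces them by hand with plain vector indexing. The object name manual is only chosen for illustration; it is not part of the original example.

```r
x <- c(5, 2, 10, 1, 3)             # Example vector from Example 1

# Drop the first element, drop the last element, then subtract element-wise
manual <- x[-1] - x[-length(x)]
manual                             # -3  8 -9  2

all.equal(manual, diff(x))         # TRUE: same result as diff(x)
```

Both versions return a vector of length 4, one element shorter than x.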
Could we also calculate a bigger lag? Yes, of course, and that’s what I’m going to show you next!
Example 2: diff Function With Lag Larger Than 1
The diff function provides the option “lag”. The default specification of this option is 1, as we have seen in Example 1. This option is particularly useful when we are dealing with time series data.
If we want to increase the size of the lag, we can specify the lag option within the diff command as follows:
diff(x, lag = 2) # Apply diff with lag
# 5 -1 -7
In this example, we are using a lag of 2. In the following figure, you can see how this output is computed:
Figure 1: Calculations of diff Function with Lag of Two.
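The lag-2 calculation can also be written out by hand. The following is a small sketch using the example vector from Example 1; the names k, later, earlier, and manual_lag2 are only chosen for illustration.

```r
x <- c(5, 2, 10, 1, 3)             # Example vector from Example 1
k <- 2                             # Size of the lag

later   <- tail(x, -k)             # 10  1  3 (first k values dropped)
earlier <- head(x, -k)             #  5  2 10 (last k values dropped)

manual_lag2 <- later - earlier     # 10 - 5, 1 - 2, 3 - 10
manual_lag2                        # 5 -1 -7
```

Each value is compared with the value two positions before it, so the output has length(x) - 2 elements.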
Alternative R Functions for the Calculation of Differences
The diff function is by no means the only R function that computes differences of data objects. It makes a lot of sense to explore other difference functions as well, so that you can decide from situation to situation which function suits your needs best.
To give you some examples: I recommend having a look at functions such as difftime for the calculation of time differences; setdiff for the identification of elements of a data object A that do not exist in a data object B; or sweep, which applies an operation such as subtraction to a data matrix by row or by column.
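As a rough illustration of these three functions, the sketch below applies each of them to small made-up inputs. All dates, vectors, and object names are invented for this example.

```r
# difftime: time difference between two dates
d <- difftime(as.Date("2024-01-11"), as.Date("2024-01-01"), units = "days")
d                                  # Time difference of 10 days

# setdiff: elements of A that do not exist in B
A <- c(1, 2, 3, 4, 5)
B <- c(2, 4)
s <- setdiff(A, B)
s                                  # 1 3 5

# sweep: subtract the column means from each column of a matrix
m <- matrix(1:6, nrow = 2)         # Columns: (1, 2), (3, 4), (5, 6)
centered <- sweep(m, 2, colMeans(m), FUN = "-")
centered                           # First row -0.5, second row 0.5
```

In the sweep call, the second argument 2 means that the statistics are applied by column; 1 would apply them by row.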
If you want to learn more about the computation of differences in R, you could also have a look at the following video tutorial of the YouTube channel Xperimental Learning. In the video, the speaker explains how to use the setdiff function. Have fun with the video and let me know in the comments which difference functions you like the most!
8 Comments
thank you so much
You are very welcome Bao 🙂
How can I use diff with panel data?
Hey Jiran,
Could you illustrate what your data looks like and what exactly you want to achieve?
Regards,
Joachim
Does diff only work with even data frames, i.e. a df with row = 672, col = 2?
Hey,
The diff function is usually applied to vectors or the columns of matrices. It does not matter how long they are.
Regards,
Joachim
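To illustrate the answer above, here is a small sketch with a made-up data frame that has an odd number of rows. The data frame df and its columns a and b are invented for this example.

```r
df <- data.frame(a = c(1, 4, 9, 16, 25),
                 b = c(10, 8, 7, 3, 0))   # 5 rows, i.e. an odd number

col_diff <- diff(df$a)                    # diff of a single column
col_diff                                  # 3 5 7 9

mat_diff <- diff(as.matrix(df))           # diff within each column
dim(mat_diff)                             # 4 2
```

When diff is applied to a matrix, the differences are computed within each column, so the result has one row fewer than the input.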
Hi Joachim,
Hope you are good. I have enrolled for a course in business analytics and I have subjects like Data Mining using R followed by econometrics later. Can you please help with some good courses where I can learn R from the basics and most of the functions covered especially for Vectors and Dataframes?
I have not done much coding, but I would like to give it a try. Please advise.
regards,
Shankar
Hey Shankar,
Thanks, I’m good, and you?
You may have a look here for some recommended resources. Furthermore, you may have a look at the DataCamp courses; they are usually quite good as well.
Regards,
Joachim