Speed Up Loop Using Rcpp Package in R (Example)
Do your loops in R take too long and cannot be vectorized? Then use the R package Rcpp to write C++ functions directly in R and speed up your code. If you are new to the Rcpp package, make sure to first take a look at our blog post about Rcpp.
In this blog post, we cover the following blocks:
Here is how!
Example 1: Simple Loop and Speed Comparison
When you work with loops in R and computation time is an issue: Vectorize! A real all-rounder in R for this are the apply functions. We have a blog post with examples of different apply functions like lapply, sapply, vapply, tapply & mapply.
However, often we use iterative algorithms where each iteration depends on the outcome of the iteration before. In those cases, vectorizing the iteration loop is not really possible. Luckily, we can speed up our loops with the Rcpp package which enables us to write and use C++ functions directly in our R code.
First, load the Rcpp package.
if (!require('Rcpp', quietly = TRUE)) { install.packages('Rcpp') } library('Rcpp') # Load package 'Rcpp'
We start with a very simple loop. Let us set \(result = 0\) and calculate \(result = cos(i – result)\) for \(i=1,\ldots,n\).
# Define an R function for the loop r_loop <- function (I) { result <- 1 for (i in 1:I) { result <- cos(i - result) } return(result) } # In C++, the same function is defined as cppFunction(' double cpp_loop(int I) { double result = 1; for (int i=1; i<=I; i++) { result = cos(i - result); } return result; } ')
We can check that both functions return the same output with an example.
r_loop(20) # [1] 0.9883715 cpp_loop(20) # [1] 0.9883715
To see whether we really speed up our code with C++, we can compare the computation time of both functions using the rbenchmark package.
if (!require('rbenchmark', quietly = TRUE)) { install.packages('rbenchmark') } library('rbenchmark') # Load package 'rbenchmark' # Compare both functions for 500 evaluations I = 10^5 benchmark(r_loop(I), cpp_loop(I), replications = 500)[,1:4] # test replications elapsed relative # 2 cpp_loop(I) 500 2.55 1.000 # 1 r_loop(I) 500 5.25 2.059
Wow. Already for this small example loop, the C++ function is more than twice as fast as the R function. Imagine that you use iterative algorithms which you have to apply many times. Then, writing C++ functions via Rcpp can save you a lot of time.
Example 2: Loop Returning All Intermediate Results
As an R user, the biggest challenge for using Rcpp is to rewrite R code as C++ code. We will therefore make some examples for loops in R and C++ to get a better understanding of the C++ syntax. In this example, we consider the same loop as before, but now we want to return not only the result at the end, but also all intermediate results.
# Define an R function for the loop r_loop2 <- function (I) { result <- vector(length = I) result[1] <- 1 for (i in 2:I) { result[i] <- cos(i - result[i-1]) } return(result) } # Install R package RcppArmadillo if not already installed. if (!require('RcppArmadillo', quietly = TRUE)) { install.packages('RcppArmadillo') } # Define a C++ function for the loop cppFunction(' arma::vec cpp_loop2(int I) { arma::vec result(I); result[0] = 1; for (int i=1; i<=I; i++) { result[i] = cos((i+1) - result[i-1]); } return result; } ', depends = c("RcppArmadillo"))
We make some comments on the C++ code. For writing C++ code, we often find it convenient to use the RcppArmadillo package. Armadillo is a C++ linear algebra library (by Conrad Sanderson) that strives for a good balance between speed and ease of use.
With arma::vec cpp_loop2(int I), we define a function called ‘cpp_loop2’ which takes as input an integer I and returns an Armadillo vector (we state that by the arma::vec before the function name). With Armadillo, we simply initiate a vector called ‘result’ of length I as arma::vec result(I);.
Check the output for I=4.
r_loop2(4) # [1] 1.00000000 0.54030231 -0.77637979 0.06394714 cpp_loop2(4) # [,1] # [1,] 1.00000000 # [2,] 0.54030231 # [3,] -0.77637979 # [4,] 0.06394714
Both functions return the same input for I=4. Note that in the R and C++ code we carefully consider the indices of vector result. In R, the first element of a vector result is indexed with result[1], in C++ it is indexed with result[0]. We see another difference. As a standard, R returns vectors as row-vectors, Armadillo vectors, on the other hand, are treated as column vectors, like matrices with only one column.
Example 3: Loop With Stopping Conditions
As another example, we implement a loop with a stopping condition. For iterative algorithms, usually we set a maximum number of iterations and use conditions indicating how much the parameters of interest change between the iterations. As an example, we take a look at the Geometric series given by \( \sum_{i=0}^{n} q^{i}\). In the Wikipedia article of the series, we read that for \(n \rightarrow \infty\) the series converges to \(1/(1-q)\) for values of \(|q|\) smaller 1. Let’s check whether this statement is plausible with a quick loop.
# Define an R function for the loop r_loop3 <- function (maxiter = 500, eps = 10^(-10), q = 0.5) { # initialize values iter <- 0 result_old <- 10 result_new <- 0 # start while loop while (iter <= maxiter & abs(result_old - result_new) > eps ) { result_old <- result_new result_new <- result_old + q^iter iter <- iter + 1 } cat(paste0("Stopped at iter = ", iter, " iterations\n")) # return results return(result_new) } # Define a C++ funtion for the loop cppFunction(' double cpp_loop3(int maxiter = 500, double eps = 10^(-10), double q = 0.5) { // initialize values int iter = 0; double result_old = 10; double result_new = 0; // start while loop while (iter <= maxiter && std::abs(result_old - result_new) > eps) { result_old = result_new; result_new = result_old + std::pow(q, iter); iter++; } Rcout << "Stopped at iter = " << iter << " iterations" << std::endl; // return output return result_new; } ')
Check whether our loops indicate that the statement that the series converges to \(1/(1-q)\) holds for \(q=0.5\).
q = 0.5 r_loop3(maxiter = 1000, eps = 10^(-10), q = q) # Stopped at iter = 35 iterations # [1] 2 cpp_loop3(maxiter = 1000, eps = 10^(-10), q = q) # Stopped at iter = 35 iterations # [1] 2 1 / (1-q) # [1] 2
All give the same results. Nice! Now a task for you: Check it for other values of \(q\). Note that this is only an example, not a proof. But we can use a loop like above to check whether some statements are plausible or not.
Video & Further Resources
For further information on Rcpp, check out our introductory article and the Rcpp information on CRAN (especially the literature mentioned!).
You can also take a look at the following YouTube Video from the R Consortium channel, which gives an introduction to extending R with C++.
Please accept YouTube cookies to play this video. By accepting you will be accessing content from YouTube, a service provided by an external third party.
If you accept this notice, your choice will be saved and the page will refresh.
We also have the following blog posts on https://statisticsglobe.com/ that you might want to take a look at.
- Introduction to the Rcpp Package in R
- R programming Overview
- Loops in R
- RcppArmadillo Package in R (Example)
We showed you how to speed up your code using loops in C++ instead of R using the Rcpp package. This code can be useful to speed up different types of loops such as for-loops, while-loops, and repeat-loops. If you have comments or questions, use the comment section below.
This page was created in collaboration with Anna-Lena Wölwer. Have a look at Anna-Lena’s author page to get more information about her academic background and the other articles she has written for Statistics Globe.
Statistics Globe Newsletter