Speed Up Loop Using Rcpp Package in R (Example)

 

Do your loops in R take too long and cannot be vectorized? Then use the R package Rcpp to write C++ functions directly in R and speed up your code. If you are new to the Rcpp package, make sure to first take a look at our blog post about Rcpp.

In this blog post, we cover the following blocks:

Here is how!

 

Example 1: Simple Loop and Speed Comparison

When you work with loops in R and computation time is an issue: Vectorize! A real all-rounder in R for this are the apply functions. We have a blog post with examples of different apply functions like lapply, sapply, vapply, tapply & mapply.

However, often we use iterative algorithms where each iteration depends on the outcome of the iteration before. In those cases, vectorizing the iteration loop is not really possible. Luckily, we can speed up our loops with the Rcpp package which enables us to write and use C++ functions directly in our R code.

First, load the Rcpp package.

if (!require('Rcpp', quietly = TRUE)) { install.packages('Rcpp') } 
library('Rcpp') # Load package 'Rcpp'

We start with a very simple loop. Let us set \(result = 0\) and calculate \(result = cos(i – result)\) for \(i=1,\ldots,n\).

# Define an R function for the loop
r_loop <- function (I) {
  result <- 1
  for (i in 1:I) {
    result <- cos(i - result)
  }
  return(result)
}
 
# In C++, the same function is defined as
cppFunction(' 
 double cpp_loop(int I) {
   double result = 1;
   for (int i=1; i<=I; i++) {
     result = cos(i - result);
   }
   return result;
 }
')

We can check that both functions return the same output with an example.

r_loop(20)
# [1] 0.9883715
cpp_loop(20)
# [1] 0.9883715

To see whether we really speed up our code with C++, we can compare the computation time of both functions using the rbenchmark package.

if (!require('rbenchmark', quietly = TRUE)) { install.packages('rbenchmark') } 
library('rbenchmark') # Load package 'rbenchmark'
 
# Compare both functions for 500 evaluations
I = 10^5
benchmark(r_loop(I),
          cpp_loop(I),
          replications = 500)[,1:4]
 
#          test replications elapsed relative
# 2 cpp_loop(I)          500    2.55    1.000
# 1   r_loop(I)          500    5.25    2.059

Wow. Already for this small example loop, the C++ function is more than twice as fast as the R function. Imagine that you use iterative algorithms which you have to apply many times. Then, writing C++ functions via Rcpp can save you a lot of time.

 

Example 2: Loop Returning All Intermediate Results

As an R user, the biggest challenge for using Rcpp is to rewrite R code as C++ code. We will therefore make some examples for loops in R and C++ to get a better understanding of the C++ syntax. In this example, we consider the same loop as before, but now we want to return not only the result at the end, but also all intermediate results.

# Define an R function for the loop
r_loop2 <- function (I) {
  result <- vector(length = I)
  result[1] <- 1
  for (i in 2:I) {
    result[i] <- cos(i - result[i-1])
  }
  return(result)
}
 
# Install R package RcppArmadillo if not already installed.
if (!require('RcppArmadillo', quietly = TRUE)) { install.packages('RcppArmadillo') } 
 
# Define a C++ function for the loop
cppFunction(' 
 arma::vec cpp_loop2(int I) {
  arma::vec result(I);
  result[0] = 1;
    for (int i=1; i<=I; i++) {
     result[i] = cos((i+1) - result[i-1]);
   }
   return result;
 }
', 
            depends = c("RcppArmadillo"))

We make some comments on the C++ code. For writing C++ code, we often find it convenient to use the RcppArmadillo package. Armadillo is a C++ linear algebra library (by Conrad Sanderson) that strives for a good balance between speed and ease of use.

With arma::vec cpp_loop2(int I), we define a function called ‘cpp_loop2’ which takes as input an integer I and returns an Armadillo vector (we state that by the arma::vec before the function name). With Armadillo, we simply initiate a vector called ‘result’ of length I as arma::vec result(I);.

Check the output for I=4.

r_loop2(4)
# [1]  1.00000000  0.54030231 -0.77637979  0.06394714
cpp_loop2(4)
#             [,1]
# [1,]  1.00000000
# [2,]  0.54030231
# [3,] -0.77637979
# [4,]  0.06394714

Both functions return the same input for I=4. Note that in the R and C++ code we carefully consider the indices of vector result. In R, the first element of a vector result is indexed with result[1], in C++ it is indexed with result[0]. We see another difference. As a standard, R returns vectors as row-vectors, Armadillo vectors, on the other hand, are treated as column vectors, like matrices with only one column.

 

Example 3: Loop With Stopping Conditions

As another example, we implement a loop with a stopping condition. For iterative algorithms, usually we set a maximum number of iterations and use conditions indicating how much the parameters of interest change between the iterations. As an example, we take a look at the Geometric series given by \( \sum_{i=0}^{n} q^{i}\). In the Wikipedia article of the series, we read that for \(n \rightarrow \infty\) the series converges to \(1/(1-q)\) for values of \(|q|\) smaller 1. Let’s check whether this statement is plausible with a quick loop.

# Define an R function for the loop
r_loop3 <- function (maxiter = 500, eps = 10^(-10), q = 0.5) {
 
  # initialize values
  iter       <- 0
  result_old <- 10
  result_new <- 0
 
  # start while loop
  while (iter <= maxiter & abs(result_old - result_new) > eps ) {
    result_old <- result_new 
    result_new <- result_old + q^iter
    iter       <- iter + 1
  }
 
  cat(paste0("Stopped at iter = ", iter, " iterations\n"))
 
  # return results
  return(result_new)
}
 
 
# Define a C++ funtion for the loop
cppFunction(' 
 double cpp_loop3(int maxiter = 500, double eps = 10^(-10), double q = 0.5) {
 
  // initialize values
  int iter          = 0;
  double result_old = 10;
  double result_new = 0;
 
  // start while loop
  while (iter <= maxiter && std::abs(result_old - result_new) > eps) {
      result_old = result_new; 
      result_new = result_old + std::pow(q, iter);
      iter++;
  }
 
   Rcout << "Stopped at iter = " << iter << " iterations" << std::endl;
 
   // return output
   return result_new;
 }
')

Check whether our loops indicate that the statement that the series converges to \(1/(1-q)\) holds for \(q=0.5\).

q = 0.5
 
r_loop3(maxiter = 1000, eps = 10^(-10), q = q)
# Stopped at iter = 35 iterations
# [1] 2
 
cpp_loop3(maxiter = 1000, eps = 10^(-10), q = q)
# Stopped at iter = 35 iterations
# [1] 2
 
1 / (1-q)
# [1] 2

All give the same results. Nice! Now a task for you: Check it for other values of \(q\). Note that this is only an example, not a proof. But we can use a loop like above to check whether some statements are plausible or not.

 

Video & Further Resources

For further information on Rcpp, check out our introductory article and the Rcpp information on CRAN (especially the literature mentioned!).

You can also take a look at the following YouTube Video from the R Consortium channel, which gives an introduction to extending R with C++.

 

 

We also have the following blog posts on https://statisticsglobe.com/ that you might want to take a look at.

 

We showed you how to speed up your code using loops in C++ instead of R using the Rcpp package. This code can be useful to speed up different types of loops such as for-loops, while-loops, and repeat-loops. If you have comments or questions, use the comment section below.

 

Anna-Lena Wölwer Survey Statistician & R Programmer

This page was created in collaboration with Anna-Lena Wölwer. Have a look at Anna-Lena’s author page to get more information about her academic background and the other articles she has written for Statistics Globe.

 

Subscribe to the Statistics Globe Newsletter

Get regular updates on the latest tutorials, offers & news at Statistics Globe.
I hate spam & you may opt out anytime: Privacy Policy.


Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Top