Introduction to the Rcpp Package in R (Examples)

 

C++ supports object-oriented programming, just like R. And if you thought R was fast: rest assured, compiled C++ code is usually faster. Luckily, with the R package Rcpp we can write C++ functions and use them directly in R to speed up our code. In this post, we show you how!

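We cover the following topics:

  • C++ and Rcpp
  • Define a C++ Function within R
  • Compare the Computation Time of C++ and R Code
  • Practical Application of Rcpp
  • Video & Further Resources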

Let’s speed things up a little!

 

C++ and Rcpp

C++ is a compiled, general-purpose programming language. In many cases, writing functions in C++ instead of R makes our R code significantly faster. So what is Rcpp? Rcpp is an R package which enables seamless integration of C++ into R.

If you work with R, you have probably already used the Rcpp package without knowing it. On the CRAN page of the Rcpp package, you can see its reverse dependencies and suggestions, that is, all the packages which use Rcpp. The list is huge and ever-growing. For example, you might recognize the lme4 package for (generalized) linear mixed-effects models.

The use of C++ via Rcpp is especially useful when

  • you work with loops which cannot be replaced by vectorized code or by functions like apply.
  • you work with iterative algorithms where certain function evaluations have to be calculated very often.
  • an algorithm requires demanding mathematical operations.

 

Define a C++ Function within R

To define a C++ function in R, we first load the Rcpp package.

if (!require('Rcpp', quietly = TRUE)) { install.packages('Rcpp') } 
library('Rcpp') # Load package 'Rcpp'

As a start, let’s make a Statistics Globe version of the ‘Hello World’ function.

cppFunction(' 
 std::string this_is_a_cpp_function() {
   std::string output = "Statistics Globe shows me Rcpp";
   return output;
}
')

Everything within the function cppFunction(' ') is compiled as C++ code and therefore has to be written in C++ syntax. C++ code looks somewhat familiar if you have worked with R before, but it still takes some work to learn it.

Let’s take apart the C++ code above to understand it better. In the above code, we define a function called ‘this_is_a_cpp_function’. After defining it, it can be used within R just like any other R function.

# Use the C++ function in R
 
this_is_a_cpp_function()
# [1] "Statistics Globe shows me Rcpp"

To have a better understanding of the C++ function defined above, we write it down again, only this time with extra comments. In C++, we indicate comments by ‘//’ instead of ‘#’.

# We are in an R script and define a C++ function with cppFunction(' ')
 
cppFunction(' 
 
 // Define function "this_is_a_cpp_function" which 
 // returns a string and takes no input
 std::string this_is_a_cpp_function() {
 
   // Define a string variable called "output" 
   // and set its value to "Statistics Globe shows me Rcpp"
   std::string output = "Statistics Globe shows me Rcpp";
 
   // Return variable output.
   // We defined function "this_is_a_cpp_function" to 
   // return a string. Therefore, we can only 
   // return a string variable here.
   return output;
 }
 
')

As an R user who is new to Rcpp, you want to focus on three main features of the C++ language:

  • Semicolons
  • Data types
  • Indices

Semicolons! Every statement has to end with a semicolon (which, frankly, R users easily forget).

We have to declare the data type of every object as well as the data type a function returns. This allows the compiler to store and operate on all objects efficiently. This extra trouble is rewarded with all the more efficiency!

In the above code, we only use the data type string, which is part of the std namespace and is therefore called via std::string. A string object in C++ is like a character object in R. Other data types include bool (boolean), int (integer), and double (double-precision floating-point numbers).
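The choice between int and double matters more in C++ than in R. As a minimal sketch in plain C++ (no Rcpp needed; the function names int_quotient and double_quotient are hypothetical and serve only as illustration), the following shows that dividing two int values discards the fractional part, whereas dividing two double values behaves like the / operator in R.

```cpp
#include <cassert>

// Hypothetical helper: divides two integers.
// Integer division in C++ discards the fractional part.
int int_quotient(int a, int b) {
  assert(b != 0);  // integer division by zero is undefined behavior in C++
  return a / b;
}

// Hypothetical helper: divides two doubles, like the / operator in R.
double double_quotient(double a, double b) {
  return a / b;
}
```

With these definitions, int_quotient(4, 5) evaluates to 0, while double_quotient(4, 5) evaluates to 0.8.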

A feature of C++ that R users easily forget: Indices start at 0, not 1! To access the first element of an object, you use object[1] in R, whereas in C++ you use object[0]. The following code gives you an example.

cppFunction(' 
 std::string this_is_a_second_cpp_function() {
   std::string output = "Test";
   Rcout << "Indices start with 0!" << std::endl;
   Rcout << "output = " << output << std::endl;
   Rcout << "output[0] = " << output[0] << std::endl;
   return output;
 }
')
 
this_is_a_second_cpp_function()
# Indices start with 0!
# output = Test
# output[0] = T
# [1] "Test"

When you write C++ code, the Rcout commands used above are very useful. You can use them to print information while a function runs, similar to the base R functions cat() and print().
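Zero-based indexing also changes where the last element sits: for an object with n elements, it is found at index n - 1, not n. The following is a minimal sketch in plain C++ (the helper names first_char and last_char are hypothetical and not part of Rcpp).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the first character of a non-empty string is at index 0.
char first_char(const std::string& s) {
  assert(!s.empty());
  return s[0];
}

// Hypothetical helper: the last character is at index size() - 1, not size().
char last_char(const std::string& s) {
  assert(!s.empty());
  return s[s.size() - 1];
}
```

For the string "Test", first_char returns 'T' and last_char returns 't'.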

 

Compare the Computation Time of C++ and R Code

Well, so far learning a new programming language like C++ for use in R seems to be quite a burden. With a small example, we show you why taking a little trouble learning C++ can actually save you a lot of time.

As an example, let’s calculate a simple loop function, where we set \(output_0 = 0\) and calculate \(output_i = \sin(i + output_{i-1})\), \(i=1,\ldots,n\).
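To make the recursion concrete, the first three steps can be worked out by hand (values rounded to four decimals, with the argument of the sine in radians):

```latex
\begin{align*}
output_1 &= \sin(1 + 0) \approx 0.8415 \\
output_2 &= \sin(2 + 0.8415) \approx 0.2956 \\
output_3 &= \sin(3 + 0.2956) \approx -0.1534
\end{align*}
```

Each step depends on the result of the previous one, which is why this calculation cannot simply be vectorized.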

# Define a C++ function for the loop
cppFunction(' 
 double loop_fun_cpp(int n) {
   double output = 0;
   for (int i=1; i<=n; i++) {
     output = sin(i + output);
   }
   return output;
 }
')
 
# Define an R function for the loop
loop_fun_r <- function (n) {
  output <- 0
  for (i in 1:n) {
    output <- sin(i + output)
  }
  return(output)
}

Both functions return the same output.

loop_fun_cpp(10)
# [1] -0.03349829
loop_fun_r(10)
# [1] -0.03349829

You can compare the computation time of the two functions using the rbenchmark package.

if (!require('rbenchmark', quietly = TRUE)) { install.packages('rbenchmark') } 
library('rbenchmark') # Load package 'rbenchmark'
 
# Compare both functions for 100 evaluations
n = 10^6
 
benchmark(loop_fun_cpp(n),
          loop_fun_r(n),
          replications = 100)[,1:4]
 
#              test replications elapsed relative
# 1 loop_fun_cpp(n)          100    4.98    1.000
# 2   loop_fun_r(n)          100   10.64    2.137

Even for this small example, the C++ function is roughly twice as fast as the R function (4.98 versus 10.64 seconds for 100 replications; the exact timings depend on your machine)! When you work with computationally demanding operations or have to call a function very often, using C++ via the Rcpp package can make a big difference for the efficiency of your code!

 

Practical Application of Rcpp

When writing code for a project, you would typically evaluate which functions or algorithms are so costly that you want to define them as C++ instead of R functions.

When you only have very few C++ functions, it is practical to define them directly within your R code via cppFunction. As an example, we define the C++ function fun1, which calculates the quotient of two numbers, with the following lines.

cppFunction(' 
 double fun1(double a, double b) {
   double output = a / b ;
   return output;
 }
')
 
fun1(4, 5)
# [1] 0.8

For a project, you typically have several C++ functions. In that case, it makes sense to store your C++ functions in separate scripts. When you use RStudio, just create another script and save it with the file extension .cpp instead of .R. RStudio then automatically applies C++ syntax highlighting to the .cpp script, which makes it easier for you to write C++ code! The .cpp script for the above function can be written as follows.

#include <Rcpp.h>
using namespace Rcpp;
 
// [[Rcpp::export]]
double fun2(double a, double b) {
  double output = a / b ;
  return output;
}

Take the first two code lines as given and always include them: they include the Rcpp header and make the Rcpp namespace available.

The line with // [[Rcpp::export]] is not an ordinary comment! It is an Rcpp attribute which declares that, when we source this code, the function directly below // [[Rcpp::export]] becomes part of the global environment and is thereby available for working in R.

As a test, store the above code as ‘test.cpp’. You can source this script in R with the following code.

# Load package 'Rcpp'
if (!require('Rcpp', quietly = TRUE)) { install.packages('Rcpp') } 
library('Rcpp') 
 
# Source your C++ code in file 'test.cpp'
sourceCpp('test.cpp') 
 
# In your global environment, you should now see function 'fun2'
# which was defined in 'test.cpp'.
# You can use this function in R.
fun2(4, 5)            
# [1] 0.8

 

Video & Further Resources

To learn more about C++ coding and Rcpp, we recommend taking a look at the package description on CRAN, which includes links to the Rcpp documentation. In particular, the paper "Rcpp: Seamless R and C++ Integration" and the book "Seamless R and C++ Integration with Rcpp" are highly recommended! There is also a GitHub page for the Rcpp package.

On his YouTube channel, the author of the article and book, Dirk Eddelbuettel, published a tutorial/webinar on the use of Rcpp which you might want to look into.

 

 

 

In this article, we introduced the R package Rcpp which enables you to significantly speed up your R code by integrating C++ code.

 

Anna-Lena Wölwer Survey Statistician & R Programmer

This page was created in collaboration with Anna-Lena Wölwer. Have a look at Anna-Lena’s author page to get more information about her academic background and the other articles she has written for Statistics Globe.

 
