R Merge Multiple Data Frames in List (2 Examples) | Base R vs. tidyverse

 

On this page you’ll learn how to simultaneously merge multiple data frames in a list in the R programming language.

The tutorial contains two examples: merging a list of data frames with Base R, and merging it with the tidyverse.

Let’s do this!

 

Exemplifying Data

Before we can start with the merging, we need to create some example data. Let’s first create three data frames in R…

data1 <- data.frame(id = 1:6,                                  # Create first example data frame
                    x1 = c(5, 1, 4, 9, 1, 2),
                    x2 = c("A", "Y", "G", "F", "G", "Y"))
 
data2 <- data.frame(id = 4:9,                                  # Create second example data frame
                    y1 = c(3, 3, 4, 1, 2, 9),
                    y2 = c("a", "x", "a", "x", "a", "x"))
 
data3 <- data.frame(id = 5:6,                                  # Create third example data frame
                    z1 = c(3, 2),
                    z2 = c("K", "b"))

…and then let’s store these data frames in a list:

data_list <- list(data1, data2, data3)                         # Combine data frames to list
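Before merging, it can be worth confirming that every data frame in the list actually contains the key column. The following is a minimal sketch of such a check, assuming the example data from above (repeated here so the snippet runs on its own); the object names has_id and n_rows are only illustrative:

```r
# Same example data as above, repeated so the snippet is self-contained
data1 <- data.frame(id = 1:6, x1 = c(5, 1, 4, 9, 1, 2), x2 = c("A", "Y", "G", "F", "G", "Y"))
data2 <- data.frame(id = 4:9, y1 = c(3, 3, 4, 1, 2, 9), y2 = c("a", "x", "a", "x", "a", "x"))
data3 <- data.frame(id = 5:6, z1 = c(3, 2), z2 = c("K", "b"))
data_list <- list(data1, data2, data3)

has_id <- sapply(data_list, function(df) "id" %in% names(df))  # Does each data frame have an id column?
has_id

n_rows <- sapply(data_list, nrow)                              # Number of rows of each data frame
n_rows
```

If has_id contained a FALSE, the merge by "id" shown in the examples below would stop with an error for that data frame.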

 

Example 1: Merge List of Multiple Data Frames with Base R

If we want to merge a list of data frames with Base R, we need to perform two steps.

First, we need to create our own merging function. Note that we have to specify the column based on which we want to join our data within this function (i.e. “id”):

my_merge <- function(df1, df2){                                # Create own merging function
  merge(df1, df2, by = "id")
}

Then, we pass our own function and the list of data frames to the Reduce function, which merges the data frames pairwise from left to right:

Reduce(my_merge, data_list)                                    # Apply Reduce to own function
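For intuition, Reduce first applies my_merge to the first two list elements and then merges that intermediate result with the third element. The sketch below spells this out by hand, assuming the example data and the my_merge function from above (repeated here so the snippet runs on its own); the names step1, by_hand, and via_reduce are only illustrative:

```r
# Same example data and merging function as above
data1 <- data.frame(id = 1:6, x1 = c(5, 1, 4, 9, 1, 2), x2 = c("A", "Y", "G", "F", "G", "Y"))
data2 <- data.frame(id = 4:9, y1 = c(3, 3, 4, 1, 2, 9), y2 = c("a", "x", "a", "x", "a", "x"))
data3 <- data.frame(id = 5:6, z1 = c(3, 2), z2 = c("K", "b"))
data_list <- list(data1, data2, data3)

my_merge <- function(df1, df2){
  merge(df1, df2, by = "id")
}

step1   <- my_merge(data1, data2)                              # Merge first and second data frame
by_hand <- my_merge(step1, data3)                              # Merge intermediate result with third data frame

via_reduce <- Reduce(my_merge, data_list)                      # Same steps, performed by Reduce
identical(by_hand, via_reduce)                                 # Compare both results
```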

 

  id x1 x2 y1 y2 z1 z2
1  5  1  G  3  x  3  K
2  6  2  Y  4  a  2  b

Table 1: Three Merged Data Frames of List.

 

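Table 1 shows the result of the merging process. Note that the previous R code conducted an inner join, so only the ids that appear in all three data frames (i.e. 5 and 6) are kept. However, we could also specify a right, left, or full join within our user-defined function by using the all, all.x, and all.y arguments of the merge function. You can learn more about the different join types in the documentation of the merge function and of the dplyr join functions.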
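As an illustration of a different join type, the following sketch performs a full join by setting all = TRUE inside the merging function. It assumes the example data from above (repeated so the snippet runs on its own); the names my_full_merge and data_full are made up for this example:

```r
# Same example data as above
data1 <- data.frame(id = 1:6, x1 = c(5, 1, 4, 9, 1, 2), x2 = c("A", "Y", "G", "F", "G", "Y"))
data2 <- data.frame(id = 4:9, y1 = c(3, 3, 4, 1, 2, 9), y2 = c("a", "x", "a", "x", "a", "x"))
data3 <- data.frame(id = 5:6, z1 = c(3, 2), z2 = c("K", "b"))
data_list <- list(data1, data2, data3)

my_full_merge <- function(df1, df2){                           # Merging function for a full join
  merge(df1, df2, by = "id", all = TRUE)                       # all = TRUE keeps unmatched rows of both inputs
}

data_full <- Reduce(my_full_merge, data_list)                  # Keeps every id; missing values become NA
data_full
```

Replacing all = TRUE by all.x = TRUE would give a left join, and all.y = TRUE would give a right join.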

You think the previous code was a bit complicated? Then you may prefer the code of the next example. So keep reading…

 

Example 2: Merge List of Multiple Data Frames with tidyverse

Example 1 relied only on the basic installation of R. However, the tidyverse provides a very smooth and simple solution for combining multiple data frames in a list. Let’s install and load the tidyverse packages (to be precise, we only need the dplyr and purrr packages for the following example):

install.packages("tidyverse")                                  # Install tidyverse package
library("tidyverse")                                           # Load tidyverse package

Now, we can use the reduce function of the purrr package (note the lower case r) together with the inner_join function of the dplyr package in order to join our multiple data sets in one line of R syntax:

data_list %>% reduce(inner_join, by = "id")                    # Apply reduce function of tidyverse

Much easier than Base R if you ask me, but that’s probably a matter of taste 🙂
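The same pattern should work with the other dplyr join functions. The sketch below loads only dplyr and purrr instead of the whole tidyverse and swaps in full_join and left_join; it assumes both packages are installed and repeats the example data so that it runs on its own (the names data_full and data_left are only illustrative):

```r
library("dplyr")                                               # Provides the join functions and %>%
library("purrr")                                               # Provides reduce

# Same example data as above
data1 <- data.frame(id = 1:6, x1 = c(5, 1, 4, 9, 1, 2), x2 = c("A", "Y", "G", "F", "G", "Y"))
data2 <- data.frame(id = 4:9, y1 = c(3, 3, 4, 1, 2, 9), y2 = c("a", "x", "a", "x", "a", "x"))
data3 <- data.frame(id = 5:6, z1 = c(3, 2), z2 = c("K", "b"))
data_list <- list(data1, data2, data3)

data_full <- data_list %>% reduce(full_join, by = "id")        # Full join: keep all ids of all data frames
data_left <- data_list %>% reduce(left_join, by = "id")        # Left join: keep all ids of the first data frame

data_full
data_left
```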

 

Video & Further Resources

Please have a look at the following video on my YouTube channel, in which I explain the contents of this R tutorial in more detail.

 

The YouTube video will be added soon.

 

In addition to the video, I recommend reading some of the other articles on the Statistics Globe website.

 

On this page you learned how to merge multiple data frames in a list using Base R and the tidyverse. Please let me know in the comments section if you have any further comments or questions.

 


