Merge List of pandas DataFrames in Python (Example)

 

This tutorial demonstrates how to join a list of multiple data sets using the pandas library in the Python programming language.

Let’s dive right into the example.

 

Example Data & Software Libraries

We first have to import the pandas library to be able to use its functions:

import pandas as pd                         # Load pandas

Next, let’s also create some example data:

data1 = pd.DataFrame({"ID": range(1, 6),    # Create first pandas DataFrame
                      "x1": range(101, 106),
                      "x2": ["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID": range(4, 9),    # Create second pandas DataFrame
                      "y1": ["a", "y", "x", "a", "y"],
                      "y2": range(20, 10, -2)})
data3 = pd.DataFrame({"ID": range(2, 10),   # Create third pandas DataFrame
                      "z1": range(21, 29),
                      "z2": range(10, 2, -1)})

We can now store all of these pandas DataFrames in a list object:

data_list = [data1, data2, data3]           # Create list of DataFrames
print(data_list)                            # Print list of DataFrames
# [   ID   x1 x2
# 0   1  101  y
# 1   2  102  x
# 2   3  103  c
# 3   4  104  v
# 4   5  105  b,    ID y1  y2
# 0   4  a  20
# 1   5  y  18
# 2   6  x  16
# 3   7  a  14
# 4   8  y  12,    ID  z1  z2
# 0   2  21  10
# 1   3  22   9
# 2   4  23   8
# 3   5  24   7
# 4   6  25   6
# 5   7  26   5
# 6   8  27   4
# 7   9  28   3]

The previous console output illustrates the structure of our list. As you can see, it contains our three example DataFrames. All of them share the column ID, but they hold different variables and only partly overlapping ID values.
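Before merging, it can be worth confirming that every DataFrame in the list actually contains the key column. The following is only a minimal sketch: it rebuilds the example data so that it runs on its own, and the names shapes and has_id are illustrative assumptions rather than part of the original example.

```python
import pandas as pd

# Rebuild the example data of this tutorial
data1 = pd.DataFrame({"ID": range(1, 6),
                      "x1": range(101, 106),
                      "x2": ["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID": range(4, 9),
                      "y1": ["a", "y", "x", "a", "y"],
                      "y2": range(20, 10, -2)})
data3 = pd.DataFrame({"ID": range(2, 10),
                      "z1": range(21, 29),
                      "z2": range(10, 2, -1)})
data_list = [data1, data2, data3]

# Hypothetical helper variables: dimensions and key check per DataFrame
shapes = [df.shape for df in data_list]
has_id = ["ID" in df.columns for df in data_list]

print(shapes)                               # [(5, 3), (5, 3), (8, 3)]
print(has_id)                               # [True, True, True]
```

If one of the entries in has_id were False, the merge on ID would fail for that DataFrame, so a check like this can catch the problem early.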

 

Example: Merge List of Multiple pandas DataFrames

The following syntax shows how to merge a list of multiple pandas DataFrames in Python programming.

For this task, we also have to import the reduce function of the functools module:

from functools import reduce                # Import reduce function
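As a brief aside, here is a minimal sketch of what reduce does in general, using plain numbers instead of DataFrames. The list numbers and the variable total are made up for illustration only.

```python
from functools import reduce

numbers = [1, 2, 3, 4]                      # Hypothetical example values

# reduce applies the function cumulatively from left to right,
# i.e. it computes ((1 + 2) + 3) + 4
total = reduce(lambda left, right: left + right, numbers)
print(total)                                # 10
```

The same left-to-right pattern applies when the elements are DataFrames and the function is a merge instead of an addition.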

Now, we can use the reduce function in combination with the merge function to join our three DataFrames in our list:

data_merge = reduce(lambda left, right:     # Merge DataFrames in list
                    pd.merge(left, right,
                             on=["ID"],
                             how="outer"),
                    data_list)
print(data_merge)                           # Print merged DataFrame

 

Table 1: Merged pandas DataFrame created from the list of DataFrames (output of print(data_merge)).

 

As you can see in Table 1, we have created a single pandas DataFrame consisting of nine rows and seven columns.

The values in this DataFrame correspond to the three input DataFrames that we created at the beginning of this tutorial. Since not every ID occurs in all three inputs, the cells without a matching value are filled with NaN.
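To make the role of reduce more tangible, the sketch below compares it with an explicit loop that merges one DataFrame after the other. It rebuilds the example data so that it runs on its own; the name data_loop is an illustrative assumption and not part of the original example.

```python
from functools import reduce

import pandas as pd

# Rebuild the example data of this tutorial
data1 = pd.DataFrame({"ID": range(1, 6),
                      "x1": range(101, 106),
                      "x2": ["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID": range(4, 9),
                      "y1": ["a", "y", "x", "a", "y"],
                      "y2": range(20, 10, -2)})
data3 = pd.DataFrame({"ID": range(2, 10),
                      "z1": range(21, 29),
                      "z2": range(10, 2, -1)})
data_list = [data1, data2, data3]

# Merge with reduce, as in the example above
data_merge = reduce(lambda left, right:
                    pd.merge(left, right, on=["ID"], how="outer"),
                    data_list)

# Equivalent explicit loop: start with the first DataFrame
# and merge each remaining DataFrame onto the running result
data_loop = data_list[0]
for df in data_list[1:]:
    data_loop = pd.merge(data_loop, df, on=["ID"], how="outer")

print(data_merge.shape)                     # (9, 7)
print(data_loop.equals(data_merge))         # True
```

Both approaches perform the same sequence of pairwise merges; reduce simply expresses the loop in a single call.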

Note that we have used an outer join to combine our DataFrames in this specific example. However, it would also be possible to combine our DataFrames with other merge types such as inner joins, left joins, and right joins by changing the how argument of the merge function.
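The effect of the different merge types can be sketched by counting the rows of each result. The example below rebuilds the example data so that it runs on its own; the helper function merge_list and the dictionary row_counts are hypothetical names introduced only for this illustration.

```python
from functools import reduce

import pandas as pd

# Rebuild the example data of this tutorial
data1 = pd.DataFrame({"ID": range(1, 6),
                      "x1": range(101, 106),
                      "x2": ["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID": range(4, 9),
                      "y1": ["a", "y", "x", "a", "y"],
                      "y2": range(20, 10, -2)})
data3 = pd.DataFrame({"ID": range(2, 10),
                      "z1": range(21, 29),
                      "z2": range(10, 2, -1)})
data_list = [data1, data2, data3]


def merge_list(frames, how):
    """Hypothetical helper: merge all DataFrames in frames on ID."""
    return reduce(lambda left, right:
                  pd.merge(left, right, on=["ID"], how=how),
                  frames)


# Number of rows returned by each merge type
row_counts = {how: merge_list(data_list, how).shape[0]
              for how in ["outer", "inner", "left", "right"]}
print(row_counts)
# {'outer': 9, 'inner': 2, 'left': 5, 'right': 8}
```

The inner join keeps only the IDs 4 and 5, which occur in all three DataFrames. The left join keeps the five IDs of the first DataFrame, and the right join keeps the eight IDs of the last DataFrame in the list.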

 

Video & Further Resources

If you need more explanations, you could have a look at the corresponding video tutorial on my YouTube channel, in which I explain the contents of this page.

 

 

In this Python tutorial you have learned how to combine a list of multiple pandas DataFrames. If you have additional questions, don’t hesitate to let me know in the comments below.

 
