Merge List of pandas DataFrames in Python (Example)
This tutorial demonstrates how to join a list of multiple data sets using the pandas library in Python programming.
Let’s dive right into the example.
Example Data & Software Libraries
We first have to import the pandas library so that we can use its functions:
import pandas as pd # Load pandas
Next, let’s also create some example data:
data1 = pd.DataFrame({"ID":range(1, 6),                   # Create first pandas DataFrame
                      "x1":range(101, 106),
                      "x2":["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID":range(4, 9),                   # Create second pandas DataFrame
                      "y1":["a", "y", "x", "a", "y"],
                      "y2":range(20, 10, -2)})
data3 = pd.DataFrame({"ID":range(2, 10),                  # Create third pandas DataFrame
                      "z1":range(21, 29),
                      "z2":range(10, 2, -1)})
We can now store all of these pandas DataFrames in a list object:
data_list = [data1, data2, data3]                         # Create list of DataFrames
print(data_list)                                          # Print list of DataFrames
# [   ID   x1 x2
# 0   1  101  y
# 1   2  102  x
# 2   3  103  c
# 3   4  104  v
# 4   5  105  b,    ID y1  y2
# 0   4  a  20
# 1   5  y  18
# 2   6  x  16
# 3   7  a  14
# 4   8  y  12,    ID  z1  z2
# 0   2  21  10
# 1   3  22   9
# 2   4  23   8
# 3   5  24   7
# 4   6  25   6
# 5   7  26   5
# 6   8  27   4
# 7   9  28   3]
The previous console output illustrates the structure of our list. As you can see, it contains the values of our three example DataFrames.
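Printing the whole list can get hard to read once the DataFrames grow. As a minimal sketch, the shape and column names of each list element could be summarized instead. The variable name summary is only an illustrative choice, and the example data are recreated so that the snippet runs on its own.

```python
import pandas as pd

# Recreate the three example DataFrames of this tutorial
data1 = pd.DataFrame({"ID": range(1, 6),
                      "x1": range(101, 106),
                      "x2": ["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID": range(4, 9),
                      "y1": ["a", "y", "x", "a", "y"],
                      "y2": range(20, 10, -2)})
data3 = pd.DataFrame({"ID": range(2, 10),
                      "z1": range(21, 29),
                      "z2": range(10, 2, -1)})
data_list = [data1, data2, data3]

# Collect (shape, column names) for every DataFrame in the list
summary = [(df.shape, list(df.columns)) for df in data_list]

# Print one compact line per DataFrame
for i, (shape, cols) in enumerate(summary, start=1):
    print(f"data{i}: shape={shape}, columns={cols}")
```

Each DataFrame has an ID column plus two further columns, which is what the merge in the next section relies on.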
Example: Merge List of Multiple pandas DataFrames
The following syntax shows how to merge a list of multiple pandas DataFrames in Python programming.
For this task, we also have to import the reduce function of the functools module:
from functools import reduce # Import reduce function
Now, we can use the reduce function in combination with the merge function to join our three DataFrames in our list:
data_merge = reduce(lambda left, right:                   # Merge DataFrames in list
                    pd.merge(left, right,
                             on = ["ID"],
                             how = "outer"),
                    data_list)
print(data_merge)                                         # Print merged DataFrame
As you can see based on Table 1, we have created a large pandas DataFrame consisting of nine rows and seven columns.
The values in this DataFrame correspond to the three input DataFrames that we have created at the beginning of this tutorial. IDs that do not occur in one of the input DataFrames get NaN in the columns of that DataFrame.
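To see what reduce does here, it may help to write the same merge out step by step. The following sketch assumes the example data from above (recreated so the snippet is self-contained) and compares a manually chained merge with the reduce version. The names step1, chained, and reduced are only illustrative.

```python
from functools import reduce

import pandas as pd

# Recreate the three example DataFrames of this tutorial
data1 = pd.DataFrame({"ID": range(1, 6),
                      "x1": range(101, 106),
                      "x2": ["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID": range(4, 9),
                      "y1": ["a", "y", "x", "a", "y"],
                      "y2": range(20, 10, -2)})
data3 = pd.DataFrame({"ID": range(2, 10),
                      "z1": range(21, 29),
                      "z2": range(10, 2, -1)})
data_list = [data1, data2, data3]

# Manual version: merge the first two DataFrames, then merge the result with the third
step1 = pd.merge(data1, data2, on="ID", how="outer")
chained = pd.merge(step1, data3, on="ID", how="outer")

# reduce version: applies the same pairwise merge from left to right over the list
reduced = reduce(lambda left, right: pd.merge(left, right, on="ID", how="outer"),
                 data_list)

print(chained.equals(reduced))   # Both approaches give the same DataFrame
print(reduced.shape)             # Number of rows and columns of the merged result
```

The advantage of reduce is that the code stays the same no matter how many DataFrames the list contains.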
Note that we have used an outer join to combine our DataFrames in this specific example. However, it would also be possible to append and combine our DataFrames with different merge types such as inner joins, left joins, and right joins.
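As a rough sketch of how the join type could be switched, the reduce call can be wrapped in a small helper that passes the how argument on to pd.merge. The function merge_list is a hypothetical helper written for this illustration, not part of pandas, and it assumes that all DataFrames share the key column ID.

```python
from functools import reduce

import pandas as pd

# Recreate the three example DataFrames of this tutorial
data1 = pd.DataFrame({"ID": range(1, 6),
                      "x1": range(101, 106),
                      "x2": ["y", "x", "c", "v", "b"]})
data2 = pd.DataFrame({"ID": range(4, 9),
                      "y1": ["a", "y", "x", "a", "y"],
                      "y2": range(20, 10, -2)})
data3 = pd.DataFrame({"ID": range(2, 10),
                      "z1": range(21, 29),
                      "z2": range(10, 2, -1)})
data_list = [data1, data2, data3]


def merge_list(frames, how="outer"):
    """Hypothetical helper: merge all DataFrames in `frames` on the ID column."""
    return reduce(lambda left, right: pd.merge(left, right, on="ID", how=how),
                  frames)


# Compare the number of rows and columns for each join type
shapes = {}
for how in ["inner", "left", "right", "outer"]:
    shapes[how] = merge_list(data_list, how=how).shape
    print(how, shapes[how])
```

With the example data, the inner join keeps only the two IDs that appear in all three DataFrames (4 and 5), the left join keeps the five IDs of data1, the right join keeps the eight IDs of data3, and the outer join keeps all nine IDs.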
Video & Further Resources
If you need more explanations on the contents of this page, you could have a look at the corresponding video tutorial on my YouTube channel, where I explain the contents of this page.
Furthermore, you might want to read some of the other articles on this website:
- Types of Joins for pandas DataFrames in Python
- Add Multiple Columns to pandas DataFrame
- Add Column from Another pandas DataFrame
- rbind & cbind pandas DataFrame in Python
- Combine pandas DataFrames Vertically & Horizontally
- Merge pandas DataFrames based on Particular Column
- Merge pandas DataFrames based on Index
- Merge Multiple pandas DataFrames in Python
- Merge Two pandas DataFrames in Python
- Combine pandas DataFrames with Different Column Names
- Combine pandas DataFrames with Same Column Names
- Append Multiple pandas DataFrames in Python
- Append pandas DataFrame in Python
- Get pandas DataFrame Column as List in Python
- Get Column Names of pandas DataFrame as List in Python
- DataFrame Manipulation Using pandas in Python
- Basic Course for the pandas Library in Python
- Python Programming Examples
In this Python tutorial, you have learned how to combine a list of multiple pandas DataFrames. If you have additional questions, don’t hesitate to let me know in the comments below.