# Combine Two ggplot2 Plots from Different Data Frames in R (Example)

In this article you’ll learn how to draw a ggplot2 plot based on several different data sources in the R programming language.

Let’s do this…

## Example Data, Add-On Packages & Default Plot

Consider the following example data:

```
data1 <- data.frame(x = 1:5,        # Create first data frame
                    y = 1:5)
data1                               # Print first data frame
#   x y
# 1 1 1
# 2 2 2
# 3 3 3
# 4 4 4
# 5 5 5
data2 <- data.frame(x = 2:6,        # Create second data frame
                    y = 8:4)
data2                               # Print second data frame
#   x y
# 1 2 8
# 2 3 7
# 3 4 6
# 4 5 5
# 5 6 4
```

The previous RStudio console output shows the structure of our example data sets: both data frames contain two numeric columns with the variable names x and y.

If we want to use the functions of the ggplot2 package, we also have to install and load ggplot2:

```
install.packages("ggplot2")         # Install ggplot2 package
library("ggplot2")                  # Load ggplot2 package
```

Now, we can move on to the example…

## Example: Drawing ggplot2 Plot Based on Two Different Data Frames

This section shows how to use the ggplot2 package to draw a plot based on two different data sets.

For this, we have to set the data argument within the ggplot function to NULL. Then, we specify two geoms (i.e. geom_point and geom_line) and define the data set we want to use within each of those geoms.

```
ggp <- ggplot(NULL, aes(x, y)) +    # Draw ggplot2 plot based on two data frames
  geom_point(data = data1, col = "red") +
  geom_line(data = data2, col = "blue")
ggp                                 # Draw plot
```

Figure 1 visualizes the output of the previous R code: a ggplot2 graph created from two different data frames.
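The global `aes(x, y)` in the code above works because both data frames use the same column names. If the column names differ, one option is to give each geom its own mapping. The following is a minimal sketch under that assumption; the data frames `temp_a` and `temp_b` and their column names are hypothetical and not part of the original example.

```r
library("ggplot2")

# Hypothetical data frames whose columns are named differently
temp_a <- data.frame(day = 1:5, value = 1:5)
temp_b <- data.frame(time = 2:6, reading = 8:4)

ggp_cols <- ggplot() +                    # No default data and no default mapping
  geom_point(data = temp_a, aes(x = day, y = value), col = "red") +
  geom_line(data = temp_b, aes(x = time, y = reading), col = "blue")
ggp_cols                                  # Draw plot
```

Referring to a column that does not exist in a layer's data frame is one way to end up with an "object not found" error, so checking `names()` of each data frame before plotting can help.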

## Video, Further Resources & Summary

Summary: In this article, I explained how to create a ggplot2 graph with two different data sets in the R programming language. This method is useful when you want to add a new layer or series of data points to a ggplot2 plot. Let me know in the comments section if you have additional questions.

• Mauricio
January 14, 2021 3:21 pm

Hi, how are you?
I have 2 data frames with 3 columns each (with 264.000 rows). The columns “Id1” (first) and “Id2” (second) are the same in both files (.tsv).
I tried to merge the two files by the columns Id1 and Id2, but this results in an error: cannot allocate vector of size 972.2 Mb in R.
I want to draw a scatter plot of the numerical values in the third columns of the two files: Val1 (third column of file 1) vs. Val_1 (third column of file 2).

• January 15, 2021 7:12 am

Hey Mauricio,

This sounds like a huge file! Do you really need all these data points or is it possible to make the file smaller before merging it?

You may have a look at this tutorial: https://statisticsglobe.com/r-error-cannot-allocate-vector-of-size-n-gb It gives some tips on how to handle the error message you are facing.
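As a rough illustration of the merge-then-plot workflow described in the question, here is a minimal sketch. The small data frames `file1` and `file2` are hypothetical stand-ins for the two .tsv files, and only the column names Id1, Id2, Val1 and Val_1 are taken from the comment. Duplicated ID pairs multiply the number of rows in a merge, which is one possible reason for running out of memory, so the sketch checks for them first.

```r
library("ggplot2")

# Hypothetical stand-ins for the two .tsv files
file1 <- data.frame(Id1 = c("a", "b", "c"),
                    Id2 = c("k", "l", "m"),
                    Val1 = c(1.2, 3.4, 5.6))
file2 <- data.frame(Id1 = c("a", "b", "c"),
                    Id2 = c("k", "l", "m"),
                    Val_1 = c(2.1, 4.3, 6.5))

# Stop early if an ID pair occurs more than once in either data frame
stopifnot(!anyDuplicated(file1[c("Id1", "Id2")]),
          !anyDuplicated(file2[c("Id1", "Id2")]))

merged <- merge(file1, file2, by = c("Id1", "Id2"))   # Join on both ID columns

ggp_scatter <- ggplot(merged, aes(Val1, Val_1)) +     # Scatter plot of the two value columns
  geom_point()
ggp_scatter
```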

I hope that helps!

Joachim

• Sudipta Roy
March 2, 2021 1:23 pm

Hello,
I have a problem. I am working on two NetCDF files and I plotted them separately without any problems. When I combined them into one plot by following your method, this error keeps coming up: “Error in FUN(X[[i]], …) : object ‘x’ not found”. What should I do? I want to plot multiple NetCDF files in a single plot. The spatial references, units and resolutions are the same in every file.

• March 2, 2021 2:17 pm

Hi Sudipta,

Could you post your R code? It’s difficult to tell the problem without seeing it.

Thanks

Joachim

• Sael
March 24, 2021 2:15 am

Could you explain how to add the legend to the figure?
I tried to add the legend but I could not do it.
Thanks

• March 24, 2021 6:41 am

Hi Sael,

Thanks a lot, glad the tutorial helped!

One solution would be to add the aes() function to both geoms, i.e.

```
geom_point(data = data1, aes(col = "red")) +
  geom_line(data = data2, aes(col = "blue"))
```

Note that this changes the colors to the default color palette of the ggplot2 package.
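If the original red and blue should be kept, one option is to map each layer to a label inside aes() and then assign the colors with scale_color_manual(). The following is a minimal sketch, assuming data1 and data2 as defined in the tutorial; the legend labels "data1" and "data2" and the legend title "Data set" are arbitrary choices for illustration.

```r
library("ggplot2")

data1 <- data.frame(x = 1:5, y = 1:5)               # Example data from the tutorial
data2 <- data.frame(x = 2:6, y = 8:4)

ggp_legend <- ggplot(NULL, aes(x, y)) +
  geom_point(data = data1, aes(col = "data1")) +    # String is a legend label, not a color
  geom_line(data = data2, aes(col = "data2")) +
  scale_color_manual(name = "Data set",             # Assign the actual colors per label
                     values = c(data1 = "red", data2 = "blue"))
ggp_legend                                          # Draw plot with legend
```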

Regards

Joachim

• Ramki
July 15, 2021 4:52 pm

I got the error below:
Error: Aesthetics must be either length 1 or the same as the data (5): y

I used data1 with 5 rows and data2 with 8 rows. How can I solve this issue? How can I plot data from two different files that have a different number of rows?

• August 2, 2021 10:01 am

Hey Ramki,

I’m sorry for the late response, I just came back from holidays and did not have the chance to read your message earlier.

Are you still looking for a solution to this problem?

Regards

Joachim

• pauloroberto
November 22, 2021 11:48 pm

Hello Joachim,

I have the same question as Ramki. I want to draw a plot using different data sets that have different numbers of rows.

• November 23, 2021 8:01 am

Hey Paulo,

Thank you for bringing this question up again. It has inspired me to create a new tutorial focusing on this topic.

You can find the tutorial here: https://statisticsglobe.com/draw-two-data-sets-different-sizes-ggplot2-plot-r
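As a rough illustration (not taken from the linked tutorial): when each geom receives its own data argument and aes() refers to column names only, the data frames do not need to have the same number of rows. One common cause of the "Aesthetics must be either length 1 or the same as the data" error is passing a vector from another data frame inside aes(), e.g. `aes(y = data2$y)`. The data frames `data_small` and `data_large` below are hypothetical.

```r
library("ggplot2")

data_small <- data.frame(x = 1:5, y = 1:5)          # 5 rows (hypothetical)
data_large <- data.frame(x = 1:8, y = 8:1)          # 8 rows (hypothetical)

ggp_sizes <- ggplot(NULL, aes(x, y)) +              # Column names only, no default data
  geom_point(data = data_small, col = "red") +      # Layer based on 5 rows
  geom_line(data = data_large, col = "blue")        # Layer based on 8 rows
ggp_sizes                                           # Draw plot
```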

Regards,
Joachim

• pauloroberto
November 23, 2021 3:18 pm

Thank you, Joachim. That helped me a lot. And as a free gift, you taught me how to combine two data sets. The trick was simply not to set a default data set in ggplot.

• November 23, 2021 3:36 pm

This is really great to hear Paulo, glad it helped! 🙂

• Silvana
March 24, 2022 7:08 pm

Hello Joachim, I need your help. Do you know how to combine two plots from different data frames (two different models plotted with the effects library)? For example:
library (nlme)
library (effects)
m1<- lme(log(Ara_Tot+1) ~ FOREST+log(Ara_Tot_m+1)+log(Prey_ChinchFit_Lepidop+1), data= REDV, random = ~1|ANIO, method ="REML")
eff_Forest<- effect("FOREST",m1, partial.residuals=TRUE)
plot(eff_Forest, smooth.residuals=FALSE,lty=1,lines=list(col="black",lwd=2), residuals.color="black",band.colors="black",band.line ="black",lwd=3, main="Spiders, Before b. pod", ylab="% of forest")

m2<- lme(log(Ara_Tot+1) ~ FOREST+log(Ara_Tot_m+1), data= PANV, random = ~1|ANIO, method ="REML")
eff_Forest<- effect("FOREST",m2, partial.residuals=TRUE)
plot(eff_Forest, smooth.residuals=FALSE,lty=1,lines=list(col="black",lwd=2), residuals.color="black",band.colors="black",band.line ="black",lwd=3, main="Spiders, Before b. pod", ylab="% of forest")

I want to show both variables of these two different models in one graphic.

Could you help me??

• March 25, 2022 7:22 am

Hey Silvana,

Are you looking for the points() function? Please have a look here for more details.

Regards,
Joachim

• Silvana
March 25, 2022 1:39 pm

Thank you Joachim! I will try, but I think the effects library does not work with this option!
Best regards, Silvana

• March 25, 2022 5:30 pm