# Merge Two Unequal Data Frames & Replace NA with 0 in R (Example)

In this tutorial, I'll show how to join two unequal data frames and replace missing values with zero in R.


It's time to dive into the R syntax…

## Exemplifying Data

The data below is used as the basis for this R programming language tutorial:

```
data1 <- data.frame(id = 1:5,      # Create first data frame
                    x1 = 5:9,
                    x2 = 5:1)
data1                              # Print data
#   id x1 x2
# 1  1  5  5
# 2  2  6  4
# 3  3  7  3
# 4  4  8  2
# 5  5  9  1

data2 <- data.frame(id = 3:7,      # Create second data frame
                    y1 = 20:24,
                    y2 = 10:14)
data2                              # Print data
#   id y1 y2
# 1  3 20 10
# 2  4 21 11
# 3  5 22 12
# 4  6 23 13
# 5  7 24 14
```

Have a look at the previous output of the RStudio console. It shows that both example data frames consist of three columns, one of which is an ID variable. However, you can also see that the IDs differ between the two data frames: data1 contains the IDs 1 to 5, whereas data2 contains the IDs 3 to 7.
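Before merging, it can help to check which IDs the two data frames share. The following is a minimal sketch using the base R functions `intersect()` and `setdiff()`; the object names `shared_ids`, `only_in_data1`, and `only_in_data2` are chosen for illustration only and are not part of the original example.

```r
# Recreate the example data so that this block runs on its own
data1 <- data.frame(id = 1:5, x1 = 5:9, x2 = 5:1)
data2 <- data.frame(id = 3:7, y1 = 20:24, y2 = 10:14)

# IDs that occur in both data frames
shared_ids <- intersect(data1$id, data2$id)
shared_ids
# [1] 3 4 5

# IDs that occur only in the first data frame
only_in_data1 <- setdiff(data1$id, data2$id)
only_in_data1
# [1] 1 2

# IDs that occur only in the second data frame
only_in_data2 <- setdiff(data2$id, data1$id)
only_in_data2
# [1] 6 7
```

The IDs that appear in only one data frame are the ones that will produce NA values after a full merge.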

## Example: Merging Data & Replacing NA with Zero

In this example, I'll show how to combine two unequal data frames and how to replace the resulting NA values with 0.

First, we merge the two data frames:

```
data_all <- merge(data1, data2,    # Merge data
                  by = "id",
                  all = TRUE)
data_all                           # Print data
#   id x1 x2 y1 y2
# 1  1  5  5 NA NA
# 2  2  6  4 NA NA
# 3  3  7  3 20 10
# 4  4  8  2 21 11
# 5  5  9  1 22 12
# 6  6 NA NA 23 13
# 7  7 NA NA 24 14
```

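As you can see based on the previous output, we have merged our two input data sets. Because we specified all = TRUE, the rows of both data frames are kept. However, some of the cells of the merged data are NA, since the IDs 1, 2, 6, and 7 occur in only one of the two data frames. We can now replace these missing values with zero: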

```
data_all[is.na(data_all)] <- 0     # Replace NA with 0
data_all                           # Print data
#   id x1 x2 y1 y2
# 1  1  5  5  0  0
# 2  2  6  4  0  0
# 3  3  7  3 20 10
# 4  4  8  2 21 11
# 5  5  9  1 22 12
# 6  6  0  0 23 13
# 7  7  0  0 24 14
```

Looks good! But note that such a replacement should only be done with theoretical justification. Otherwise, results based on the merged data may be biased.
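If a zero is only justified for some variables, the replacement can be restricted to those columns. The following is a minimal sketch, assuming (purely for illustration) that only y1 and y2 should be set to zero; the names `data_partial` and `zero_cols` are hypothetical and not part of the original example.

```r
# Recreate the example data so that this block runs on its own
data1 <- data.frame(id = 1:5, x1 = 5:9, x2 = 5:1)
data2 <- data.frame(id = 3:7, y1 = 20:24, y2 = 10:14)

# Full merge that keeps the rows of both data frames
data_partial <- merge(data1, data2, by = "id", all = TRUE)

# Columns in which NA should become 0 (an assumption made for this sketch)
zero_cols <- c("y1", "y2")

# Replace NA with 0 in the selected columns only
data_partial[zero_cols][is.na(data_partial[zero_cols])] <- 0
data_partial
#   id x1 x2 y1 y2
# 1  1  5  5  0  0
# 2  2  6  4  0  0
# 3  3  7  3 20 10
# 4  4  8  2 21 11
# 5  5  9  1 22 12
# 6  6 NA NA 23 13
# 7  7 NA NA 24 14
```

In this variant, the missing values in x1 and x2 stay NA, so later analyses can still distinguish them from true zeros.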

## Video, Further Resources & Summary

Would you like to know more about the merging of data frames? Then you might want to watch the following video, which I have published on my YouTube channel. In the video, I illustrate the examples of this tutorial in RStudio.

In addition, you may want to have a look at some of the related articles of https://www.statisticsglobe.com/.

Summary: This tutorial showed how to merge two unequal data frames and replace the resulting NA values with zero in the R programming language. Please let me know in the comments if you have additional questions.


• Hi, Joachim. You’re doing a great job. I have a question about merging data. I have two datasets with different sample names (IDs). Some variables are present in both datasets, others only in one, which means I have to merge by rows and columns. When I use this approach, I always get duplicate columns if the variable x is present in both datasets. Is there a solution how to merge such datasets?

• I think full_join should work.
Thanks anyway.
