Combine Columns to Remove NA Values in R (2 Examples)
This tutorial shows how to merge data frame columns and remove NAs in R programming.
You’re here for the answer, so let’s get straight to the R code…
Creating Example Data
We’ll use the following data as the basis for this R programming language tutorial:
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),    # Create example data frame
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)
data                                                     # Print example data frame
Table 1 shows that our example data contains six rows and five columns. The variables x1, x2, and x3 are character strings and the variables y and z are integers.
The columns x1-x3 contain NA values. The following examples show how to combine these columns to remove the NAs.
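Before combining the columns, it can help to check how the NA values are distributed. The following lines are a minimal sketch in base R that recreates the example data and counts the missing values; the object names cols, na_per_column, and values_per_row are only illustrative and are not used elsewhere in this tutorial.

```r
# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)

cols <- c("x1", "x2", "x3")                         # Illustrative name for the columns to check

na_per_column <- colSums(is.na(data[ , cols]))      # Number of NA values in each column
na_per_column

values_per_row <- rowSums(! is.na(data[ , cols]))   # Number of non-NA values in each row
values_per_row

all(values_per_row == 1)                            # TRUE: each row has exactly one non-NA value
```

In this example data, every row holds exactly one non-missing value in x1-x3, which is what makes the columns easy to merge into a single column.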
Let’s do this!
Example 1: Join Columns to Delete NA Values Using cbind(), na.omit() & unlist() Functions
This example shows how to use base R functions (i.e. functions that come with the basic installation of the R programming language) to combine our data frame columns.
However, as a first step, we have to specify the column names that we want to combine:
col_combine <- c("x1", "x2", "x3")    # Specify columns to combine
col_combine                           # Print column names
# [1] "x1" "x2" "x3"
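# Combine certain columns. unlist() on a data frame stacks the values column by
# column, so the columns are transposed first to keep each value in its own row.
data_new1 <- cbind(x = na.omit(unlist(as.data.frame(t(data[ , col_combine])),
                                      use.names = FALSE)),
                   data[ , ! colnames(data) %in% col_combine, drop = FALSE])
data_new1                             # Print data frame with combined columns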
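Table 2 shows the output of the previous R syntax: a new data frame in which the columns x1-x3 have been combined and the NA values have been removed. Note that this approach assumes that each row contains exactly one non-NA value across x1-x3, so that the combined vector has as many elements as the data frame has rows.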
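If some rows might contain several non-NA values, or none at all, a row-wise alternative may be safer than stacking and filtering the values. The following is a minimal sketch in base R, not the method of Example 1: it uses Reduce() and ifelse() to keep the first non-NA value of each row. The names first_non_na and data_new1b are only illustrative.

```r
# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)

col_combine <- c("x1", "x2", "x3")           # Columns to combine

# Walk through the columns from left to right and, within each row,
# fill the value only where the result so far is still NA
first_non_na <- Reduce(function(a, b) ifelse(is.na(a), b, a),
                       data[col_combine])

data_new1b <- data.frame(x = first_non_na,   # Combined column plus remaining columns
                         data[ , ! colnames(data) %in% col_combine, drop = FALSE],
                         stringsAsFactors = FALSE)
data_new1b                                   # Print data frame with combined column
```

With this sketch, a row that has only NA values in x1-x3 keeps an NA in x instead of changing the length of the combined vector.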
Example 2: Join Columns to Delete NA Values Using dplyr & purrr Packages
First, we have to install and load the two packages:
install.packages("dplyr")    # Install dplyr package
library("dplyr")             # Load dplyr package

install.packages("purrr")    # Install purrr package
library("purrr")             # Load purrr package
Now, we can use the functions of those two packages to create a new data frame with combined columns:
data_new2 <- data %>%                 # Combine certain columns
  dplyr::mutate(x = invoke(coalesce, across(all_of(col_combine)))) %>%
  dplyr::select(x, colnames(data)[! colnames(data) %in% col_combine])
data_new2                             # Print data frame with combined columns
Table 3 shows the output of the previous R programming syntax: The variables x1-x3 have been joined and the NA values have been deleted.
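As far as I am aware, recent purrr releases (1.0.0 and later) flag invoke() as deprecated, so the code above may print a deprecation warning. The following is a hedged sketch of a possible alternative that needs only dplyr and replaces invoke() with the base R function do.call(). It assumes a reasonably recent dplyr version with across() and all_of(); the name data_new3 is only illustrative.

```r
library("dplyr")                             # Load dplyr package

# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)

col_combine <- c("x1", "x2", "x3")           # Columns to combine

data_new3 <- data %>%
  # across() returns the selected columns as a list-like data frame,
  # which do.call() passes to coalesce() as separate arguments
  mutate(x = do.call(coalesce, across(all_of(col_combine)))) %>%
  # Put x first, followed by all columns that were not combined
  select(x, ! all_of(col_combine))
data_new3                                    # Print data frame with combined column
```

Like the code of Example 2, coalesce() returns the first non-NA value in each row, so rows with more than one non-NA value do not cause an error.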
Video, Further Resources & Summary
If you need further information on the R code of this tutorial, I recommend having a look at the accompanying video on my YouTube channel, in which I illustrate the examples of this article.
In addition, you might want to have a look at the other articles on this homepage. A selection of articles on topics such as graphics in R, vectors, missing data, and ggplot2 is listed below:
- Remove or Show NA Values in Table
- Remove NA Values from ggplot2 Plot
- Remove NA Values from Vector
- The R Programming Language
In summary: In this article, I have illustrated how to combine data frame columns and remove NA values in the R programming language. Let me know in the comments section if you have additional questions.