Combine Columns to Remove NA Values in R (2 Examples)
This tutorial shows how to merge data frame columns and remove NAs in R programming.
You’re here for the answer, so let’s get straight to the R code…
Creating Example Data
We’ll use the following data as the basis for this R programming language tutorial:
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),    # Create example data frame
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)
data                                                     # Print example data frame
Table 1 shows that our example data contains six rows and five columns. The variables x1, x2, and x3 are character strings and the variables y and z are integers.
The columns x1-x3 contain NA values. The following examples show how to combine these columns to remove the NAs.
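Before combining, it can help to confirm that every row holds exactly one non-NA value across the three columns, because merging is only unambiguous in that case. The following is a minimal sketch of such a check; the object names x_cols and non_na_per_row are chosen here for illustration and are not used elsewhere in this tutorial:

```r
# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)

x_cols <- c("x1", "x2", "x3")                        # Illustrative name for the columns to check

non_na_per_row <- rowSums(!is.na(data[ , x_cols]))   # Count non-NA values in each row
non_na_per_row                                       # Print counts per row

all(non_na_per_row == 1)                             # TRUE for this example data
```

If this check returned FALSE for your own data, you would first have to decide which value should win in rows with several non-NA entries.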
Let’s do this!
Example 1: Join Columns to Delete NA Values Using cbind(), na.omit() & unlist() Functions
This example shows how to use base R functions to combine our data frame columns.
As a first step, we have to specify the names of the columns that we want to combine:
col_combine <- c("x1", "x2", "x3")    # Specify columns to combine
col_combine                           # Print column names
# [1] "x1" "x2" "x3"
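Next, we can apply the cbind, na.omit, unlist, and colnames functions to create a new data frame with merged columns without NA values. Note that unlist() concatenates the columns of a data frame one after another (all of x1, then x2, then x3), which would return the values in the order a, b, f, c, d, e and shift them away from their original rows. For that reason, we transpose the selected columns with t() and as.data.frame() first, so that the values are collected row by row and stay aligned with y and z. This approach assumes that every row contains exactly one non-NA value in the columns x1-x3:
data_new1 <- cbind(x = na.omit(unlist(as.data.frame(t(data[ , col_combine])),    # Combine certain columns
                                      use.names = FALSE)),
                   data[ , ! colnames(data) %in% col_combine, drop = FALSE])
data_new1                                                                        # Print data frame with combined columns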
Table 2 shows the output of the previous R syntax: a new data frame in which the columns x1-x3 have been combined and the NA values have been removed.
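The approach above relies on each row holding exactly one non-NA value in x1-x3. As a hedged alternative for data where that is not guaranteed, the following base R sketch picks the first non-NA value in each row and returns NA for rows that are entirely missing. The helper first_non_na and the object x_rowwise are hypothetical names introduced only for this illustration:

```r
# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)
col_combine <- c("x1", "x2", "x3")

# Hypothetical helper: first non-NA element of a vector, or NA if there is none
first_non_na <- function(v) {
  v <- v[!is.na(v)]
  if (length(v) == 0) NA_character_ else v[1]
}

# Apply the helper to every row of the selected columns
x_rowwise <- unname(apply(data[ , col_combine], 1, first_non_na))

data_rowwise <- cbind(x = x_rowwise,
                      data[ , ! colnames(data) %in% col_combine, drop = FALSE])
data_rowwise                                         # Print data frame with combined columns
```

Because the values are processed one row at a time, the result always has as many elements as the data frame has rows, so the cbind step cannot fail due to a length mismatch.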
Example 2: Join Columns to Delete NA Values Using dplyr & purrr Packages
The following code illustrates how to use the functions of the dplyr and purrr packages to combine data frame columns and drop NAs.
First, we have to install and load the two packages:
install.packages("dplyr")    # Install dplyr package
library("dplyr")             # Load dplyr package

install.packages("purrr")    # Install purrr package
library("purrr")             # Load purrr package
Now, we can use the functions of those two packages to create a new data frame with combined columns:
data_new2 <- data %>%                                    # Combine certain columns
  dplyr::mutate(x = invoke(coalesce, across(all_of(col_combine)))) %>%
  dplyr::select(x, colnames(data)[! colnames(data) %in% col_combine])
data_new2                                                # Print data frame with combined columns
Table 3 shows the output of the previous R programming syntax: The variables x1-x3 have been joined and the NA values have been deleted.
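To make it clearer what coalesce does conceptually, the following sketch mimics its behavior in base R: for each position, it keeps the first value that is not NA when moving through the columns from left to right. This is only an illustration under the assumption that all columns have the same type. The helper coalesce_two and the object x_base are hypothetical names and not part of dplyr or purrr:

```r
# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = c("a", "b", NA, NA, NA, "f"),
                   x2 = c(NA, NA, "c", "d", NA, NA),
                   x3 = c(NA, NA, NA, NA, "e", NA),
                   y = 11:16,
                   z = 6:1)
col_combine <- c("x1", "x2", "x3")

# Hypothetical helper: keep a where it is not NA, otherwise take the value of b
coalesce_two <- function(a, b) ifelse(is.na(a), b, a)

# Fold the helper over the selected columns from left to right
x_base <- Reduce(coalesce_two, data[col_combine])
x_base                                               # Print combined vector
```

Since the columns are compared element by element, the combined values stay in their original rows, and rows with several non-NA values simply keep the leftmost one.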
Video, Further Resources & Summary
In case you need further info on the R codes of this tutorial, I recommend having a look at the following video on my YouTube channel. I illustrate the examples of this article in the video:
In addition, you might have a look at the other articles on this homepage. Please find a selection of articles about topics such as graphics in R, vectors, missing data, and ggplot2 below:
- Remove or Show NA Values in Table
- Remove NA Values from ggplot2 Plot
- Remove NA Values from Vector
- The R Programming Language
In summary: In this article, I have illustrated how to combine data frame columns and remove NA values in the R programming language. Kindly let me know in the comments section, in case you have additional questions.