Split Data Frame Variable into Multiple Columns in R (3 Examples) | Separate Character String Based on Delimiter
In this tutorial, I’ll illustrate how to separate one variable of a data frame into multiple columns in R.
The article contains the following:
- Creation of Example Data
- Example 1: Split Column with Base R
- Example 2: Split Column with stringr Package
- Example 3: Split Column with tidyr Package
- Video & Further Resources
Let’s get started.
Creation of Example Data
In the examples of this R programming tutorial, we’ll use the following data frame as a basis:
data <- data.frame(x = c("a-1", "b-2", "c-3"))
data
#     x
# 1 a-1
# 2 b-2
# 3 c-3
Our example data consists of one column. This column contains character strings, and each value consists of two parts separated by a - sign.
In the following examples, I’ll show how to split this column into multiple columns based on the delimiter “-”. So keep on reading.
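Before splitting, it can help to confirm that every value contains the delimiter exactly once, because the examples below assume two parts per value. The following is only a minimal sketch in base R; the object names `parts` and `n_parts` are illustrative and not part of the original tutorial:

```r
# Example data as defined above
data <- data.frame(x = c("a-1", "b-2", "c-3"))

# Split each value at the literal "-" and count the resulting pieces
parts <- strsplit(as.character(data$x), "-", fixed = TRUE)
n_parts <- lengths(parts)

# TRUE if every value splits into exactly two pieces
all(n_parts == 2)
```

If this check returns FALSE, some rows would produce more or fewer pieces than expected, and the splitting code in the examples may need adjusting for those rows.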
Example 1: Split Column with Base R
The basic installation of R provides a solution for the splitting of variables based on a delimiter. If we want to split our variable with Base R, we can use a combination of the data.frame, do.call, rbind, strsplit, and as.character functions. Have a look at the following R code:
data.frame(do.call("rbind", strsplit(as.character(data$x), "-", fixed = TRUE)))
#   X1 X2
# 1  a  1
# 2  b  2
# 3  c  3
As you can see based on the output of the RStudio console, we created a new data frame consisting of two columns X1 and X2. The first column contains the values before the separator “-” and the second column contains the values after the delimiter “-”.
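The default names X1 and X2 are not very descriptive. As a small extension (not part of the original example), the following sketch assigns column names and attaches the result to the original data frame. The names `letter`, `number`, `split_mat`, `data_split`, and `data_new` are arbitrary choices for illustration:

```r
data <- data.frame(x = c("a-1", "b-2", "c-3"))

# Split the column and stack the pieces row by row into a character matrix
split_mat <- do.call("rbind", strsplit(as.character(data$x), "-", fixed = TRUE))

# Convert to a data frame and give the columns descriptive (hypothetical) names
data_split <- data.frame(split_mat, stringsAsFactors = FALSE)
colnames(data_split) <- c("letter", "number")

# Keep the original column next to the new ones
data_new <- cbind(data, data_split)
data_new
```

Note that both new columns are still character columns at this point; a call such as as.numeric would be needed to work with the second column as numbers.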
Looks good, but you may find the previous R code kind of complicated (I agree). Fortunately, the R programming language provides add-on packages that allow for the splitting of variables with simpler R code. You guessed it – That’s what I’m going to show you next!
Example 2: Split Column with stringr Package
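In order to use the functions of the stringr package, we first have to install and load the package:
install.packages("stringr")
library("stringr")
…and then we can use the str_split_fixed command to split our column into two columns: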
str_split_fixed(data$x, "-", 2)
#      [,1] [,2]
# [1,] "a"  "1"
# [2,] "b"  "2"
# [3,] "c"  "3"
The output of the function is a matrix consisting of two columns with the same values as shown in Example 1. However, this time the R code was much cleaner.
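Because str_split_fixed returns a character matrix rather than a data frame, a conversion step is often useful in practice. The sketch below shows one possible way to do this, assuming the stringr package is installed; the object names `mat` and `data_split` and the column names are illustrative only:

```r
library("stringr")

data <- data.frame(x = c("a-1", "b-2", "c-3"))

# str_split_fixed returns a character matrix with one row per input value
mat <- str_split_fixed(data$x, "-", 2)

# Convert the matrix to a data frame; the column names are illustrative
data_split <- as.data.frame(mat, stringsAsFactors = FALSE)
colnames(data_split) <- c("col1", "col2")

# The second column is still character; convert it if numbers are needed
data_split$col2 <- as.numeric(data_split$col2)
str(data_split)
```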
Example 3: Split Column with tidyr Package
Another popular alternative for splitting data is based on the tidyr package. The tidyr package is part of the Tidyverse, and hence part of a family of very popular packages such as dplyr, ggplot2, and purrr. Let’s install and load the tidyr package to R:
install.packages("tidyr")
library("tidyr")
Now, we can use the following R syntax to split our variable into two columns:
data %>% separate(x, c("col1", "col2"), "-")
#   col1 col2
# 1    a    1
# 2    b    2
# 3    c    3
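By default, separate returns the new columns as character strings and drops the original column. As a hedged sketch (not part of the original example), the convert and remove arguments of separate can change both behaviors; the object name `data_sep` is illustrative, and the tidyr package is assumed to be installed:

```r
library("tidyr")

data <- data.frame(x = c("a-1", "b-2", "c-3"))

# convert = TRUE lets tidyr guess the type of each new column
# remove = FALSE keeps the original column x in the output
data_sep <- separate(data, x, into = c("col1", "col2"), sep = "-",
                     convert = TRUE, remove = FALSE)
str(data_sep)
```

With convert = TRUE, the digits in the second column should come back as integers, while the letters in the first column remain character strings.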
Video & Further Resources
If you need further explanations on the R code of this article, you might have a look at the following video on my YouTube channel. In the video, I illustrate the content of this tutorial:
Additionally, you could have a look at the related tutorials of my website. I have released several other tutorials about the manipulation of data already:
- rbind Function in R
- str_split & str_split_fixed Functions in R
- stringr Package in R
- The strsplit Function in R
- split & unsplit Functions in R
- The do.call R Function
- R Functions List (+ Examples)
- The R Programming Language
In this article, you learned how to separate a character column into multiple columns using R programming. Please let me know in the comments if you have any additional questions.