Split Data Frame Variable into Multiple Columns in R (3 Examples) | Separate Character String Based on Delimiter

 

In this tutorial, I’ll illustrate how to separate one variable of a data frame into multiple columns in R.

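The article contains the creation of example data, followed by three examples: a Base R solution, the stringr package, and the tidyr package.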

Let’s get started.

 

Creation of Example Data

In the examples of this R programming tutorial, we’ll use the following data frame as a basis:

data <- data.frame(x = c("a-1", "b-2", "c-3"))
data
#     x
# 1 a-1
# 2 b-2
# 3 c-3

Our example data consists of one column. This column contains character strings, and each value consists of two parts separated by a hyphen (-).
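Before splitting, it can help to confirm the column type and to check that every value contains the delimiter exactly once. The following is only a minimal Base R sketch of such a check; the object name n_delim is a hypothetical name chosen for illustration:

```r
# Example data, recreated so the sketch runs on its own
data <- data.frame(x = c("a-1", "b-2", "c-3"))

# Column class: "character" in R >= 4.0.0; older versions create a
# factor by default (stringsAsFactors = TRUE)
class(data$x)

# Count how often the delimiter "-" occurs in each value
x_chr <- as.character(data$x)
n_delim <- lengths(regmatches(x_chr, gregexpr("-", x_chr, fixed = TRUE)))
n_delim
# [1] 1 1 1
```

If some values contained no delimiter or several delimiters, the splitting approaches would behave differently from one another, so such a check is worth running on real data.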

In the following examples, I’ll show how to split this column into multiple columns based on the delimiter “-”. So keep on reading.

 

Example 1: Split Column with Base R

The basic installation of R provides a solution for the splitting of variables based on a delimiter. If we want to split our variable with Base R, we can use a combination of the data.frame, do.call, rbind, strsplit, and as.character functions. Have a look at the following R code:

data.frame(do.call("rbind", strsplit(as.character(data$x), "-", fixed = TRUE)))
#   X1 X2
# 1  a  1
# 2  b  2
# 3  c  3

As you can see based on the output of the RStudio console, we created a new data frame consisting of the two columns X1 and X2. The first column contains the values before the delimiter “-” and the second column contains the values after it.
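To make the nested function calls easier to follow, the same approach can be written step by step. This is only a sketch: the column names letter and number, as well as the object names, are hypothetical choices for illustration, and the numeric conversion is an optional extra step.

```r
# Example data, recreated so the sketch runs on its own
data <- data.frame(x = c("a-1", "b-2", "c-3"))

# Step 1: strsplit() returns a list with one character vector per row
parts <- strsplit(as.character(data$x), "-", fixed = TRUE)

# Step 2: rbind() stacks those vectors into a character matrix
parts_mat <- do.call("rbind", parts)

# Step 3: convert to a data frame and assign readable column names
data_split <- data.frame(parts_mat, stringsAsFactors = FALSE)
colnames(data_split) <- c("letter", "number")

# Optional: convert the second column from character to numeric
data_split$number <- as.numeric(data_split$number)

# Optional: keep the original column next to the new columns
data_new <- cbind(data, data_split)
data_new
#     x letter number
# 1 a-1      a      1
# 2 b-2      b      2
# 3 c-3      c      3
```

Note that rbind only gives a clean matrix when every value splits into the same number of pieces, as is the case in our example data.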

Looks good, but you may find the previous R code kind of complicated (I agree). Fortunately, the R programming language provides add-on packages that allow for the splitting of variables with simpler R code. You guessed it: that’s what I’m going to show you next!

 

Example 2: Split Column with stringr Package

A very simple solution for separating character strings is provided by the str_split_fixed function of the stringr package. First, we need to install and load the stringr package…

install.packages("stringr")
library("stringr")

…and then we can use the str_split_fixed command to split our column into two columns:

str_split_fixed(data$x, "-", 2)
#     [,1] [,2]
# [1,] "a"  "1" 
# [2,] "b"  "2" 
# [3,] "c"  "3"

The output of the function is a matrix consisting of two columns with the same values as shown in Example 1. However, this time the R code was much cleaner.
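Since str_split_fixed returns a matrix rather than a data frame, you may want to convert the result and assign column names. The following sketch assumes that the stringr package is installed; the names col1 and col2 and the object names are hypothetical. It also illustrates how the third argument behaves when a value contains more or fewer pieces than requested.

```r
library("stringr")

# Example data, recreated so the sketch runs on its own
data <- data.frame(x = c("a-1", "b-2", "c-3"))

# str_split_fixed() returns a character matrix with 2 columns
split_mat <- str_split_fixed(data$x, "-", 2)

# Convert the matrix to a data frame and assign column names
data_split <- as.data.frame(split_mat, stringsAsFactors = FALSE)
colnames(data_split) <- c("col1", "col2")
data_split

# With n = 2, additional delimiters stay in the last piece,
# and missing pieces are filled with empty strings
edge <- str_split_fixed(c("a-1-x", "b"), "-", 2)
edge[1, ]   # "a" and "1-x"
edge[2, ]   # "b" and ""
```

The fixed number of columns is what distinguishes this function from str_split, which returns a list.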

 

Example 3: Split Column with tidyr Package

Another popular alternative for splitting data is based on the tidyr package. The tidyr package is part of the Tidyverse, and hence part of a family of very popular packages such as dplyr, ggplot2, or the purrr package. Let’s install and load the tidyr package to R:

install.packages("tidyr")
library("tidyr")

Now, we can use the following R syntax to split our variable into two columns:

data %>%
  separate(x, c("col1", "col2"), "-")
#   col1 col2
# 1    a    1
# 2    b    2
# 3    c    3

Easy peasy!
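The separate function offers a few optional arguments that are often useful in practice. The sketch below assumes that the tidyr package is installed and uses named arguments; the object name data_sep is hypothetical. Setting convert = TRUE asks tidyr to guess the column types, and remove = FALSE keeps the original column.

```r
library("tidyr")

# Example data, recreated so the sketch runs on its own
data <- data.frame(x = c("a-1", "b-2", "c-3"))

# Split x at "-", keep the original column, and convert types:
# col1 stays character, col2 becomes integer
data_sep <- separate(data, x, into = c("col1", "col2"), sep = "-",
                     convert = TRUE, remove = FALSE)

# Inspect the resulting column classes
str(data_sep)
```

In more recent tidyr versions (1.3.0 and later), separate is marked as superseded in favor of separate_wider_delim, so you may prefer that function in new code.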

 


In this article, you learned how to separate a character column into multiple columns using the R programming language. Please let me know in the comments if you have any additional questions.

 
