Reshape Data Frame from Wide to Long Format in R (3 Examples)

 

In this tutorial, I’ll illustrate how to convert a data frame from wide to long format in the R programming language.

You’re here for the answer, so let’s get straight to the examples:

 

Example Data

We’ll use the following data frame in the examples of this R tutorial:

data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),     # Create example data
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)
data_wide                                                     # Print example data
#   ID1 ID2 x  y
# 1   A   b 1  6
# 2   A   c 2  7
# 3   B   c 3  8
# 4   C   a 4  9
# 5   B   d 5 10

As the previous output of the RStudio console shows, our example data frame consists of five rows and four columns. Two of the columns contain IDs, and the other two columns contain variables with actual values.

Furthermore, our example data frame is in wide format. In the following sections of this article you’ll learn how to reshape such a data frame to long format.
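
If you want to verify the structure described above before reshaping, a quick check along the following lines may help. This is only a minimal sketch in base R; it recreates the example data so that it can be run on its own:

```r
data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),     # Recreate example data
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)

dim(data_wide)                                                # Number of rows and columns
sapply(data_wide, class)                                      # Class of each column
```

The ID columns are the ones that should be kept fixed when reshaping, while x and y are the columns that get stacked.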

 

Example 1: Reshape Data Frame with melt Function (reshape2 Package)

Example 1 illustrates how to convert a data frame from wide to long structure with the melt function of the reshape2 package. Let’s install and load the package to R:

install.packages("reshape2")                                  # Install reshape2
library("reshape2")                                           # Load reshape2

Now, we can use the melt function of the reshape2 package to convert our data frame to long format:

data_long1 <- melt(data_wide,                                 # Apply melt function
                  id.vars = c("ID1", "ID2"))
data_long1                                                    # Print long data
#    ID1 ID2 variable value
# 1    A   b        x     1
# 2    A   c        x     2
# 3    B   c        x     3
# 4    C   a        x     4
# 5    B   d        x     5
# 6    A   b        y     6
# 7    A   c        y     7
# 8    B   c        y     8
# 9    C   a        y     9
# 10   B   d        y    10

The previous output shows the result: A melted data frame with ten rows and four columns.
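
In case the default column names variable and value are not descriptive enough, the melt function also accepts the arguments measure.vars, variable.name, and value.name. The following is a sketch of how that might look; the names "measure" and "score" are hypothetical and chosen for illustration only:

```r
library("reshape2")                                           # Load reshape2

data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),     # Recreate example data
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)

data_long1b <- melt(data_wide,                                # Apply melt function
                    id.vars = c("ID1", "ID2"),                # Columns to keep fixed
                    measure.vars = c("x", "y"),               # Columns to stack
                    variable.name = "measure",                # Hypothetical column name
                    value.name = "score")                     # Hypothetical column name
dim(data_long1b)                                              # Rows and columns of long data
colnames(data_long1b)                                         # Column names of long data
```

Specifying measure.vars explicitly is optional here, because melt treats all columns that are not listed in id.vars as measure variables.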

 

Example 2: Reshape Data Frame with pivot_longer Function (tidyr Package)

An alternative for reshaping data from wide to long format is provided by the tidyr package:

install.packages("tidyr")                                     # Install tidyr
library("tidyr")                                              # Load tidyr

The tidyr package contains the pivot_longer function, which can be used to reshape a data frame as shown below:

data_long2 <- data_wide %>%                                   # Apply pivot_longer function
  pivot_longer(c("x", "y"))
data_long2                                                    # Print long data
# # A tibble: 10 x 4
#    ID1   ID2   name  value
#    <chr> <chr> <chr> <int>
#  1 A     b     x         1
#  2 A     b     y         6
#  3 A     c     x         2
#  4 A     c     y         7
#  5 B     c     x         3
#  6 B     c     y         8
#  7 C     a     x         4
#  8 C     a     y         9
#  9 B     d     x         5
# 10 B     d     y        10

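Note that the pivot_longer function returns a tibble, which is a subclass of the data.frame class. In case you prefer to work with plain data frames, you can convert this tibble back using the as.data.frame function. Note also that the rows are ordered differently than in Example 1: pivot_longer keeps the values of each original row together, whereas melt stacks the complete x column on top of the y column.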
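
The conversion from a tibble to a plain data frame could be sketched as follows. The example also uses the names_to and values_to arguments of pivot_longer to rename the new columns; the object name data_long2_df is made up for this illustration:

```r
library("tidyr")                                              # Load tidyr

data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),     # Recreate example data
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)

data_long2_tbl <- pivot_longer(data_wide,                     # Apply pivot_longer function
                               cols = c("x", "y"),            # Columns to stack
                               names_to = "variable",         # Name of the new key column
                               values_to = "value")           # Name of the new value column

data_long2_df <- as.data.frame(data_long2_tbl)                # Convert tibble to data frame
class(data_long2_df)                                          # Check class of converted data
```

With names_to = "variable", the column names match those of the melt output in Example 1, which can be convenient when both approaches are used in the same project.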

Example 3: Reshape Data Frame with gather Function (tidyr Package)

The tidyr package also contains the gather function. The gather function is superseded, and the developers of tidyr recommend using the pivot_longer function instead.

However, for the sake of completeness I also want to show an example that uses the gather function.

Have a look at the following R code:

data_long3 <- data_wide %>%                                   # Apply gather function
  gather(variable, value, - c(ID1, ID2))
data_long3                                                    # Print long data
#    ID1 ID2 variable value
# 1    A   b        x     1
# 2    A   c        x     2
# 3    B   c        x     3
# 4    C   a        x     4
# 5    B   d        x     5
# 6    A   b        y     6
# 7    A   c        y     7
# 8    B   c        y     8
# 9    C   a        y     9
# 10   B   d        y    10

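The printed output of the previous R code is the same as in Example 1 (i.e. a long data frame). One difference is not visible in the printout: melt stores the variable column as a factor, whereas gather stores it as a character vector by default. Whether you prefer the reshape2 package or the tidyr package is a matter of taste.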
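
To see how closely the results of Example 1 and Example 3 match, it may be worth comparing the class of the variable column. The following sketch recreates both long data frames and additionally uses the factor_key argument of gather; the object name data_long3_fct is hypothetical:

```r
library("reshape2")                                           # Load reshape2
library("tidyr")                                              # Load tidyr

data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),     # Recreate example data
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)

data_long1 <- melt(data_wide,                                 # Long data of Example 1
                   id.vars = c("ID1", "ID2"))
data_long3 <- data_wide %>%                                   # Long data of Example 3
  gather(variable, value, - c(ID1, ID2))

class(data_long1$variable)                                    # melt creates a factor
class(data_long3$variable)                                    # gather creates a character by default

data_long3_fct <- data_wide %>%                               # Ask gather for a factor instead
  gather(variable, value, - c(ID1, ID2), factor_key = TRUE)
class(data_long3_fct$variable)                                # Check class of key column
```

The difference rarely matters for printing, but it can matter for plotting or modeling functions that treat factors and character strings differently.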

 

Video, Further Resources & Summary

At this point you should have learned how to stack variables on top of each other to change the data format from wide to long efficiently in the R programming language. Don’t hesitate to let me know in the comments, in case you have additional questions.

 


4 Comments

  • Hi

    I am a fan of your YouTube videos about how to use R software.

    Do you know how to use the reshape2 package in R to convert this short format of information: “459.0 1033.0
    776.0 1023.0
    459.0 1013.0
    491.0 1035.0
    457.0 1066.0”
    to this long format “459.0, 1033.0, 776.0, 023.0, 459.0,1013.0, 491.0, 1035.0, 457.0, 1066.0, 459.0, 1033.0, 776.0 1023.0, 459.0, 1013.0, 491.0 1035.0, 457.0, 1066.0”
    The difference is that the short format has two rows without commas, while the long format has one long line of numbers in a column with commas. I could do it manually, but I have heaps of data and I am trying to find a way to use reshape2 or some other package in R to automate this task to save time. Could you please help by sharing your wisdom about R?

  • Hi Cansu

    Yes, that is correct. I am trying to find a way to automate the conversion of data from the .tem format, which looks like this:
    “459.0 1033.0
    776.0 1023.0
    459.0 1013.0
    491.0 1035.0
    457.0 1066.0”
    into the row/column TXT format, which looks like this:
    “459.0, 1033.0, 776.0, 023.0, 459.0,1013.0, 491.0, 1035.0, 457.0, 1066.0, 459.0, 1033.0, 776.0 1023.0, 459.0, 1013.0, 491.0 1035.0, 457.0, 1066.0”

    I have a lot of data, so I am trying to automate that process with R software to make it faster. Could you please let me know how to do that with R software?

    Yes, I have data in two rows and want to put it in one column. However, it is also important to remove the spaces and replace them with commas or tab stops.

    Kind regards Thomas

