Reshape Data Frame from Wide to Long Format in R (3 Examples)
In this tutorial, I’ll illustrate how to convert a data frame from wide to long format in the R programming language.
You’re here for the answer, so let’s get straight to the examples:
We’ll use the following data frame in the examples of this R tutorial:
data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),    # Create example data
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)
data_wide                                                    # Print example data
#   ID1 ID2 x  y
# 1   A   b 1  6
# 2   A   c 2  7
# 3   B   c 3  8
# 4   C   a 4  9
# 5   B   d 5 10
As the previous output of the RStudio console shows, our example data frame consists of five rows and four columns. Two of the columns contain IDs and the other two contain variables with actual values.
Furthermore, our example data frame is in wide format. In the following sections of this article you’ll learn how to reshape such a data frame to long format.
Example 1: Reshape Data Frame with melt Function (reshape2 Package)
Example 1 illustrates how to convert a data frame from wide to long structure with the melt function of the reshape2 package. Let’s install and load the package to R:
install.packages("reshape2")    # Install reshape2
library("reshape2")             # Load reshape2
Now, we can use the melt function of the reshape2 package to convert our data frame to long format:
data_long1 <- melt(data_wide,                     # Apply melt function
                   id.vars = c("ID1", "ID2"))
data_long1                                        # Print long data
#    ID1 ID2 variable value
# 1    A   b        x     1
# 2    A   c        x     2
# 3    B   c        x     3
# 4    C   a        x     4
# 5    B   d        x     5
# 6    A   b        y     6
# 7    A   c        y     7
# 8    B   c        y     8
# 9    C   a        y     9
# 10   B   d        y    10
The previous output shows the result: A melted data frame with ten rows and four columns.
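The column names variable and value are the defaults of melt. As a minimal sketch, the variable.name and value.name arguments of melt can be used to pick other names. The names "measure" and "score" below are made up for illustration and do not appear elsewhere in this tutorial:

```r
library("reshape2")                               # Assumes reshape2 is installed

# Same example data as at the top of this tutorial
data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)

data_long1b <- melt(data_wide,
                    id.vars = c("ID1", "ID2"),
                    variable.name = "measure",    # Hypothetical name for the key column
                    value.name = "score")         # Hypothetical name for the value column

colnames(data_long1b)                             # Inspect the new column names
# [1] "ID1"     "ID2"     "measure" "score"
```

The reshaped rows are the same as in data_long1. Only the labels of the last two columns change.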
Example 2: Reshape Data Frame with pivot_longer Function (tidyr Package)
Another alternative for the reshaping of data from wide to long format is provided by the tidyr package:
install.packages("tidyr")    # Install tidyr
library("tidyr")             # Load tidyr
The tidyr package contains the pivot_longer function, which can be used to reshape a data frame as shown below:
data_long2 <- data_wide %>%      # Apply pivot_longer function
  pivot_longer(c("x", "y"))
data_long2                       # Print long data
# # A tibble: 10 x 4
#    ID1   ID2   name  value
#    <chr> <chr> <chr> <int>
#  1 A     b     x         1
#  2 A     b     y         6
#  3 A     c     x         2
#  4 A     c     y         7
#  5 B     c     x         3
#  6 B     c     y         8
#  7 C     a     x         4
#  8 C     a     y         9
#  9 B     d     x         5
# 10 B     d     y        10
Note that the pivot_longer function returns a tibble instead of a plain data frame, and that the rows are ordered by the original rows rather than by variable. In case you prefer to work with data frames, you can convert this tibble back to the data.frame class, for example with the as.data.frame function.
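The conversion back to a plain data frame could look as follows. This is a sketch rather than a required step: it uses the base R function as.data.frame, and it additionally sets the names_to and values_to arguments of pivot_longer so that the column names match the melt output of Example 1. The object name data_long2_df is chosen only for this illustration:

```r
library("tidyr")                                  # Assumes tidyr is installed

# Same example data as at the top of this tutorial
data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)

# names_to / values_to label the two new columns (defaults: "name" and "value")
data_long2 <- pivot_longer(data_wide,
                           cols = c("x", "y"),
                           names_to = "variable",
                           values_to = "value")

data_long2_df <- as.data.frame(data_long2)        # Drop the tibble classes

class(data_long2_df)                              # Check the class
# [1] "data.frame"
```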
Example 3: Reshape Data Frame with gather Function (tidyr Package)
The tidyr package also contains the gather function. The gather function has been superseded, and the developers of tidyr recommend using the pivot_longer function for new code.
However, for the sake of completeness I also want to show an example that uses the gather function.
Have a look at the following R code:
data_long3 <- data_wide %>%                  # Apply gather function
  gather(variable, value, - c(ID1, ID2))
data_long3                                   # Print long data
#    ID1 ID2 variable value
# 1    A   b        x     1
# 2    A   c        x     2
# 3    B   c        x     3
# 4    C   a        x     4
# 5    B   d        x     5
# 6    A   b        y     6
# 7    A   c        y     7
# 8    B   c        y     8
# 9    C   a        y     9
# 10   B   d        y    10
The printed output of the previous R code looks the same as in Example 1 (i.e. a long data frame). One difference is not visible in the printout: by default, melt stores the variable column as a factor, whereas gather stores it as a character vector. Whether you prefer the reshape2 package or the tidyr package is a matter of taste.
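To compare the results of melt and gather beyond the printed output, the column classes can be inspected directly. The following sketch assumes the default settings of both functions and rebuilds the two long data frames from the example data:

```r
library("reshape2")                               # Assumes reshape2 is installed
library("tidyr")                                  # Assumes tidyr is installed

# Same example data as at the top of this tutorial
data_wide <- data.frame(ID1 = c("A", "A", "B", "C", "B"),
                        ID2 = c("b", "c", "c", "a", "d"),
                        x = 1:5,
                        y = 6:10)

data_long1 <- melt(data_wide, id.vars = c("ID1", "ID2"))
data_long3 <- gather(data_wide, variable, value, - c(ID1, ID2))

class(data_long1$variable)                        # Key column created by melt
# [1] "factor"
class(data_long3$variable)                        # Key column created by gather
# [1] "character"

# After converting the factor to character, keys and values agree
identical(as.character(data_long1$variable), data_long3$variable)
# [1] TRUE
identical(data_long1$value, data_long3$value)
# [1] TRUE
```

If a factor is needed in the gather result as well, the factor_key argument of gather can be set to TRUE.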
Video, Further Resources & Summary
Do you need further information on the examples of the present article? Then you could watch the following video of my YouTube channel. I illustrate the R code of this page in the video:
At this point you should have learned how to stack variables on top of each other to change the data format from wide to long efficiently in the R programming language. Don’t hesitate to let me know in the comments, in case you have additional questions.