Read All Worksheets of Excel File into List in R (Example)

 

In this tutorial, I’ll illustrate how to import all sheets of an Excel file to a list in R.

The tutorial will consist of this:

Here’s how to do it…

 

Example Data & Packages

As the first step, we have to construct an example Excel file with multiple worksheets. For this, we have to install and load the xlsx package:

install.packages("xlsx")                               # Install xlsx package
library("xlsx")                                        # Load xlsx package

Next, we can create an Excel file with three sheets using the following R syntax:

write.xlsx2(iris, "data_all.xlsx", row.names = FALSE)  # Create Excel file with multiple sheets
# [1] "Sheet1" "Sheet2" "Sheet3"
write.xlsx2(mtcars, "data_all.xlsx", row.names = FALSE, append = TRUE, sheetName = "Sheet2")
write.xlsx2(beaver1, "data_all.xlsx", row.names = FALSE, append = TRUE, sheetName = "Sheet3")

 

excel file multiple sheets

 

Have a look at the previous screenshot. It shows that the example Excel file has three different worksheets.

 

Example: Importing All Sheets of Excel File Using readxl Package

The following R syntax explains how to import all sheets of our Excel file into a list in R.

First, we have to install and load the readxl package:

install.packages("readxl")                             # Install & load readxl package
library("readxl")

Next, we can use the excel_sheets function to create a vector containing all sheet names as character strings:

sheet_names <- excel_sheets("data_all.xlsx")           # Get sheet names
sheet_names                                            # Print sheet names
# [1] "Sheet1" "Sheet2" "Sheet3"

As you can see based on the previous output of the RStudio console, our example file has three worksheets named Sheet1, Sheet2, and Sheet3.

In the next step, we can use the lapply, as.data.frame, and read_excel functions to import our data to R:

list_all <- lapply(sheet_names, function(x) {          # Read all sheets to list
  as.data.frame(read_excel("data_all.xlsx", sheet = x)) } )

The previous R code has created a new list object called list_all, which contains all worksheets of our excel file converted as data frame objects.

It makes sense to rename the list elements of our list by the sheet names of our Excel file:

names(list_all) <- sheet_names                         # Rename list elements

That’s it! Now, we can have a look at each of the elements of our list.

The first six rows of our Sheet1 data look as follows…

head(list_all$Sheet1)                                  # Print head of first sheet

 

table 1 data frame read all worksheets excel file list r

 

…the head of the second sheet looks as follows…

head(list_all$Sheet2)                                  # Print head of second sheet

 

table 2 data frame read all worksheets excel file list r

 

…and the third sheet has the following data structure:

head(list_all$Sheet3)                                  # Print head of third sheet

 

table 3 data frame read all worksheets excel file list r

 

Looks good!

 

Video, Further Resources & Summary

Some time ago I have published a video on my YouTube channel, which explains the R code of this tutorial. You can find the video below:

 

Please accept YouTube cookies to play this video. By accepting you will be accessing content from YouTube, a service provided by an external third party.

YouTube Content Consent Button Thumbnail

YouTube privacy policy

If you accept this notice, your choice will be saved and the page will refresh.

 

In addition, you may want to read the other tutorials of this website. Some tutorials are listed below.

 

In this article you have learned how to create a list of data frames based on an Excel file in R programming. Please tell me about it in the comments section, if you have further questions.

 

Subscribe to the Statistics Globe Newsletter

Get regular updates on the latest tutorials, offers & news at Statistics Globe.
I hate spam & you may opt out anytime: Privacy Policy.


13 Comments. Leave new

  • Thanks for this. If I want all the data from all three sheets in one dataframe with the Sheet names to showing as a new column how do I do that?

    Reply
  • Hi Joachim, Indeed I do. Here is the Scenario: I have 36 tabs which are named Jan20 – Dec22. I want to extract all the data in the tab into R and use the tab (sheet names) as the date column in the R data frame. There is no date field in the data apart from the tab/ sheet names.
    The data columns are the same for all tabs.

    I managed to use this code to achieve it:
    Baselevel_Expense_Tbl %
    mutate(sheet_contents =
    map(sheet, function(sheet) {
    read_excel(“file.xlsx”.xlsx”,
    sheet = sheet)})) %>%
    unnest(sheet_contents)

    However something strange happened, R shows NAs for one of the columns of for all sheets except Jan20,Jan21 and Jan22 where it pulls all the data as in the excel sheets. For example for February20 the USD column is pulling NAs but the EUR and GBP columns are pulling through the expected values.
    My email is kengjisu@duck.com if you want to reachout for a clearer explanation.
    Thanks

    Reply
  • Hi Joachim, when I upload and unnest the file I get no errors.

    Reply
    • One guess would be that the data is formatted differently in Jan20-Jan22 compared to February20.

      What happens if you only import February20? Is the data formatted properly in this case?

      Regards,
      Joachim

      Reply
  • So there are 274 columns in each sheet. USD is the 5th column in each of the sheets. The data from all columns in the Jan20, Jan21 and Jan22 sheets come through. In the other sheets, The data in the USD(5th column) is read through as NAs but all the other data on the remaining columns come through.
    The formats are all the same.

    Reply
    • Could you please try to import the February20 data set as a separate data.frame without combining it with the other data? Is the data imported properly in this case?

      Regards,
      Joachim

      Reply
  • I have deleted the Jan20, Jan21,Jan22 sheets and now Feb20 USD column is pulling through the figures but NAs are being pulled in for the other moths in the USD columns though the other currencies are coming through correctly.

    Reply
  • Hi Joachim, I think I figured it out:
    read_excel(“Month End TopRevenue PMR.xlsx”,trim_ws = TRUE,col_types = “text”,skip = 3,n_max = 1000,
    sheet = sheet)}))
    col_types =”text”. For some reason R was not recognizing the format of the USD column so returning NAs

    Reply
  • Illya Bjazevic
    July 14, 2022 4:53 pm

    Hi! I just saw your video on YouTube about How to import data from an Excel file with several sheets.

    Do you have another video about what would be the next step if you do not have the same columns in the different Excel sheets?

    Of course there should be some relation between them.

    Have you written o publish a video about that topic?

    By they way, I am a big fan! You always have fantastic material!

    We are very grateful!

    Reply
    • Hey Illya,

      Thank you so much for the very kind feedback, glad you find my videos helpful! 🙂

      Regarding your question, you may combine data sets with different column names using the bind_rows function of the dplyr package. Have a look here.

      I hope that helps!

      Joachim

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Top