Read All Files in Directory & Apply Function to Each Data Frame in R (Example)

 

In this tutorial, you’ll learn how to import all data frames from a folder and apply a function to each of them in the R programming language.

With that, let’s do this:

 

Example Data

Let’s first construct some example data:

data1 <- data.frame(x1 = 1:5,                           # Create data frames
                    x2 = 3:7,
                    x3 = 2)
data2 <- data.frame(y1 = c(6, 9, 2),
                    y2 = c(4, 2, 3))
data3 <- data.frame(z1 = c(3, 6, 7, 2),
                    z2 = c(1, 5, 2, 3),
                    z3 = 4:1)

The previous R code creates our example data: three data frames with different numeric columns and different numbers of rows.

Now, let’s define the path of our directory…

my_path <- "C:/Users/... Your Path .../"
setwd(my_path)                                          # Set path to folder

…and let’s write our example data frames to this directory:

write.csv2(data1, "data1.csv", row.names = FALSE)       # Write data frames
write.csv2(data2, "data2.csv", row.names = FALSE)
write.csv2(data3, "data3.csv", row.names = FALSE)

Have a look at the directory that you have specified before. It should now contain three csv files, one for each data frame.
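
One detail worth keeping in mind: write.csv2 writes semicolon-separated files with a comma as decimal mark, so such files should be read back with the matching read.csv2 function. The following sketch illustrates what can happen otherwise. The temporary folder and the file name demo_semicolon.csv are assumptions made for this illustration and are not part of the example above:

```r
# Illustrative sketch: folder and file name are hypothetical
demo_dir <- file.path(tempdir(), "separator_demo")      # Temporary demo folder
dir.create(demo_dir, showWarnings = FALSE)

demo_data <- data.frame(y1 = c(6, 9, 2),                # Small demo data frame
                        y2 = c(4, 2, 3))
demo_file <- file.path(demo_dir, "demo_semicolon.csv")
write.csv2(demo_data, demo_file, row.names = FALSE)     # Semicolon-separated file

matched    <- read.csv2(demo_file)                      # Reader matches the writer
mismatched <- read.csv(demo_file)                       # Reader expects commas

ncol(matched)                                           # Columns are separated
ncol(mismatched)                                        # Columns collapse into one
```

If your own files were created with write.csv (comma-separated), read.csv would be the matching reader instead.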

 

Example: Import Data Frames from Directory & Apply Function

The following R programming syntax illustrates how to open the data frames stored in our directory. First, we need to extract the file names of all csv files:

data_frame_names <- list.files(pattern = "\\.csv$")     # Get all file names
data_frame_names                                        # Return file names to console
# "data1.csv" "data2.csv" "data3.csv"
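
Note that the pattern argument of list.files is interpreted as a regular expression, not as a file glob. A stricter way to select only files whose names end in .csv might look like the following sketch. The temporary folder and the file names in it are made up for illustration:

```r
# Illustrative sketch: folder and file names are hypothetical
demo_dir <- file.path(tempdir(), "pattern_demo")        # Temporary demo folder
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("sales.csv",          # Create empty demo files
                                  "costs.csv",
                                  "csv_notes.txt")))

csv_regex <- list.files(demo_dir,                       # Regex: name ends with ".csv"
                        pattern = "\\.csv$")
csv_glob  <- list.files(demo_dir,                       # Glob converted to a regex
                        pattern = utils::glob2rx("*.csv"))
csv_regex                                               # Only the two csv files
csv_glob                                                # Same selection
```

Both variants skip csv_notes.txt, because its name contains "csv" but does not end in ".csv".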

As the console output of data_frame_names shows, our folder contains the csv files data1, data2, and data3. Let’s read these three files into R. Since we wrote them with write.csv2, we use the matching read.csv2 function:

data_frame_list <- lapply(data_frame_names, read.csv2)  # Read all data frames
data_frame_list                                         # List containing all data frames
# [[1]]
#   x1 x2 x3
# 1  1  3  2
# 2  2  4  2
# 3  3  5  2
# 4  4  6  2
# 5  5  7  2
# 
# [[2]]
#   y1 y2
# 1  6  4
# 2  9  2
# 3  2  3
# 
# [[3]]
#   z1 z2 z3
# 1  3  1  4
# 2  6  5  3
# 3  7  2  2
# 4  2  3  1
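
The list elements above are only numbered, which can make it hard to tell which data frame came from which file. One possible refinement is to name each element after its file, as in the following sketch. It assumes a temporary folder instead of the working directory, and the object names file_paths and named_list are introduced here for illustration only:

```r
# Illustrative sketch: folder and object names are hypothetical
demo_dir <- file.path(tempdir(), "names_demo")          # Temporary demo folder
dir.create(demo_dir, showWarnings = FALSE)
write.csv2(data.frame(y1 = c(6, 9, 2), y2 = c(4, 2, 3)),
           file.path(demo_dir, "data2.csv"), row.names = FALSE)
write.csv2(data.frame(z1 = c(3, 6, 7, 2), z2 = c(1, 5, 2, 3), z3 = 4:1),
           file.path(demo_dir, "data3.csv"), row.names = FALSE)

file_paths <- list.files(demo_dir,                      # Full paths, no setwd needed
                         pattern = "\\.csv$",
                         full.names = TRUE)
named_list <- lapply(file_paths, read.csv2)             # Read all data frames
names(named_list) <- tools::file_path_sans_ext(         # Name elements after files
  basename(file_paths))

names(named_list)                                       # Element names
named_list$data3                                        # Access a data frame by name
```

Using full.names = TRUE also means the files can be read without changing the working directory first.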

Our three data frames are now stored in the list data_frame_list. Next, we can use the lapply function to apply a function to each of our data frames. In this example, we’ll compute the row sums of each data frame:

data_frame_output <- lapply(data_frame_list, rowSums)   # Apply function to all data frames
data_frame_output                                       # Return result
# [[1]]
# [1]  6  8 10 12 14
# 
# [[2]]
# [1] 10 11  5
# 
# [[3]]
# [1]  8 14 11  6

The previous output of the RStudio console shows our final result.
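
rowSums is only one option: lapply accepts any function that takes a data frame as its first argument, including functions you define yourself. The following sketch rebuilds two of the example data frames in memory, so that it runs without the csv files, and uses the hypothetical helper summarize_columns:

```r
# Illustrative sketch: demo_list and summarize_columns are hypothetical names
demo_list <- list(data2 = data.frame(y1 = c(6, 9, 2),
                                     y2 = c(4, 2, 3)),
                  data3 = data.frame(z1 = c(3, 6, 7, 2),
                                     z2 = c(1, 5, 2, 3),
                                     z3 = 4:1))

summarize_columns <- function(df) {                     # User-defined function
  colMeans(df)                                          # Mean of each column
}

column_means <- lapply(demo_list, summarize_columns)    # One result per data frame
row_counts   <- sapply(demo_list, nrow)                 # sapply simplifies to a vector

column_means                                            # List of named numeric vectors
row_counts                                              # Named vector of row counts
```

lapply always returns a list, whereas sapply tries to simplify the result, which is convenient when each data frame yields a single value.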

 


In summary: This tutorial explained how to read all csv files from a directory into R and how to apply a function to each of the resulting data frames. In case you have additional questions or comments, don’t hesitate to let me know in the comments section.

 



Comments

  • I’m having difficulty with the step which combines all of the dataframes into one list. The dataframes in that list end up having all of the columns combined into one and the rows are very long. One of your example frames would appear like this in a 3×1:
    [[2]]
    y1y2
    1 6 4
    2 9 2
    3 2 3
    The same datasheets in my files are split up correctly into multiple columns. The dataframes in R have their columns separated before using the code in this article to lapply them into a list. Any idea as to why this is happening? I changed variable names but otherwise what I have is identical to what this article says should work
