Read All Files in Directory & Apply Function to Each Data Frame in R (Example)
In this tutorial, you’ll learn how to import all data frames from a folder and apply a function to each of them in the R programming language.
With that, let’s do this:
Example Data
Let’s first construct some example data:
data1 <- data.frame(x1 = 1:5,                      # Create data frames
                    x2 = 3:7,
                    x3 = 2)
data2 <- data.frame(y1 = c(6, 9, 2),
                    y2 = c(4, 2, 3))
data3 <- data.frame(z1 = c(3, 6, 7, 2),
                    z2 = c(1, 5, 2, 3),
                    z3 = 4:1)
The previous R code shows the structure of our example data frames: We’ve created three data frames with different numeric columns.
Now, let’s define the path of our directory…
my_path <- "C:/Users/... Your Path .../"
setwd(my_path)                                     # Set path to folder
…and let’s write our example data frames to this directory:
write.csv2(data1, "data1.csv", row.names = FALSE)  # Write data frames
write.csv2(data2, "data2.csv", row.names = FALSE)
write.csv2(data3, "data3.csv", row.names = FALSE)
Have a look at the directory that you have specified before. It should contain three data frames stored in csv files.
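One detail is worth checking at this point: write.csv2 separates fields with semicolons and uses a comma as the decimal mark, so the files have to be read back with read.csv2 (or with matching sep and dec settings). The following is a minimal sketch of that check. It assumes a scratch folder inside tempdir() instead of my_path, and the names tmp_dir and tmp_file are made up for illustration.

```r
# Sketch: look at the raw text that write.csv2() produces.
# tmp_dir and tmp_file are hypothetical names; a temporary folder is used
# so that the working directory from the tutorial is left untouched.
tmp_dir <- file.path(tempdir(), "csv2_separator_check")
dir.create(tmp_dir, showWarnings = FALSE)

data2 <- data.frame(y1 = c(6, 9, 2),
                    y2 = c(4, 2, 3))
tmp_file <- file.path(tmp_dir, "data2.csv")
write.csv2(data2, tmp_file, row.names = FALSE)

raw_lines <- readLines(tmp_file)                   # Raw lines of the file
writeLines(raw_lines)
# "y1";"y2"
# 6;4
# 9;2
# 2;3

cols_semicolon <- ncol(read.csv2(tmp_file))        # Semicolon reader: 2 columns
cols_comma     <- ncol(read.csv(tmp_file))         # Comma reader: 1 merged column
```

If a file was written with a different separator than the reader expects, all columns end up merged into a single one, as in the second call.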
Example: Import Data Frames from Directory & Apply Function
The following R programming syntax illustrates how to open the data frames stored in our directory. First, we need to extract the file names of all csv files:
data_frame_names <- list.files(pattern = "\\.csv$")  # Get all file names ending in .csv
data_frame_names                                     # Return file names to console
# "data1.csv" "data2.csv" "data3.csv"
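The pattern argument of list.files is interpreted as a regular expression, not as a shell wildcard, so an anchored pattern such as "\\.csv$" is a safe way to match only names that end in .csv. Here is a small sketch of the difference, assuming a scratch folder inside tempdir() with a few empty files; the folder and file names are invented for this illustration.

```r
# Sketch: how the regular expression in `pattern` affects the match.
# tmp_dir and the file names below are hypothetical.
tmp_dir <- file.path(tempdir(), "list_files_pattern_demo")
dir.create(tmp_dir, showWarnings = FALSE)
invisible(file.create(file.path(tmp_dir,
                                c("data1.csv", "data2.csv",
                                  "notes.txt", "csv_info.txt"))))

only_csv <- list.files(tmp_dir, pattern = "\\.csv$")   # Names ending in .csv
only_csv
# "data1.csv" "data2.csv"

loose <- list.files(tmp_dir, pattern = "csv")          # Any name containing "csv"
length(loose)                                          # Also picks up csv_info.txt
# 3

csv_paths <- list.files(tmp_dir, pattern = "\\.csv$",
                        full.names = TRUE)             # Paths including the folder
```

With full.names = TRUE the returned paths can be passed to a reading function directly, which makes it possible to skip setwd if that is preferred.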
As you can see based on the previous output of the RStudio console, our folder contains the csv files data1, data2, and data3. Let’s read these three data frames into R:
data_frame_list <- lapply(data_frame_names, read.csv2)  # Read all data frames
data_frame_list                                         # List containing all data frames
# [[1]]
#   x1 x2 x3
# 1  1  3  2
# 2  2  4  2
# 3  3  5  2
# 4  4  6  2
# 5  5  7  2
#
# [[2]]
#   y1 y2
# 1  6  4
# 2  9  2
# 3  2  3
#
# [[3]]
#   z1 z2 z3
# 1  3  1  4
# 2  6  5  3
# 3  7  2  2
# 4  2  3  1
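The list returned by lapply has no element names, so the data frames can only be addressed by position. One possible extension, not part of the original example, is to name each element after its file. The sketch below assumes a scratch folder inside tempdir() with two small csv files; tmp_dir, csv_paths, and named_list are illustrative names.

```r
# Sketch: name each list element after the file it was read from.
# tmp_dir, csv_paths and named_list are hypothetical names.
tmp_dir <- file.path(tempdir(), "named_list_demo")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv2(data.frame(y1 = c(6, 9, 2), y2 = c(4, 2, 3)),
           file.path(tmp_dir, "data2.csv"), row.names = FALSE)
write.csv2(data.frame(z1 = c(3, 6, 7, 2), z3 = 4:1),
           file.path(tmp_dir, "data3.csv"), row.names = FALSE)

csv_paths  <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
named_list <- lapply(csv_paths, read.csv2)             # Read all files

# Strip the folder and the extension to get short element names
names(named_list) <- tools::file_path_sans_ext(basename(csv_paths))

names(named_list)
# "data2" "data3"
named_list$data2$y1                                    # Access by name
# 6 9 2
```

Named elements also carry over to the output of a later lapply call, which makes the results easier to match back to their files.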
As you can see, we have stored our three data frames in a list. Now, we can use the lapply function to apply a function to each of our data frames. In this example, we’ll compute the row sums of each data frame with the rowSums function:
data_frame_output <- lapply(data_frame_list, rowSums)  # Apply function to all data frames
data_frame_output                                      # Return result
# [[1]]
# [1]  6  8 10 12 14
#
# [[2]]
# [1] 10 11  5
#
# [[3]]
# [1]  8 14 11  6
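The same approach works with any function that accepts a data frame, including user-defined ones. As a rough sketch, the example data is rebuilt in memory below so that the code runs without the csv files; summarise_df and summary_matrix are hypothetical names that do not appear in the original example.

```r
# Sketch: apply a user-defined function to every data frame in a list.
# The list is rebuilt in memory; summarise_df is a made-up helper.
data_frame_list <- list(
  data.frame(x1 = 1:5, x2 = 3:7, x3 = 2),
  data.frame(y1 = c(6, 9, 2), y2 = c(4, 2, 3)),
  data.frame(z1 = c(3, 6, 7, 2), z2 = c(1, 5, 2, 3), z3 = 4:1)
)

n_rows <- sapply(data_frame_list, nrow)                # sapply returns a vector
n_rows
# 5 3 4

summarise_df <- function(df) {                         # One summary per data frame
  c(n_rows = nrow(df),
    n_cols = ncol(df),
    total  = sum(df))                                  # Sum of all (numeric) cells
}

# One row per data frame, one column per summary value
summary_matrix <- do.call(rbind, lapply(data_frame_list, summarise_df))
summary_matrix[, "total"]
# 50 26 39
```

lapply always returns a list, while sapply simplifies the result to a vector or matrix when every call returns a value of the same length.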
The previous output of the RStudio console shows our final result.
Video & Further Resources
If you need further info on the R code of this tutorial, you may want to have a look at the following video on my YouTube channel. I’m illustrating the R syntax of this tutorial in the video:
The YouTube video will be added soon.
Furthermore, you might have a look at the other articles on my homepage.
- Getting & Setting Working Directories
- apply() Function in R
- List All Files with Specific Extension in R
- Import & Merge Multiple csv Files in R
- The R Programming Language
In summary: This tutorial explained how to read data files from a computer in R programming. In case you have additional questions or comments, don’t hesitate to let me know in the comments section.
2 Comments
I’m having difficulty with the step which combines all of the dataframes into one list. The dataframes in that list end up having all of the columns combined into one and the rows are very long. One of your example frames would appear like this in a 3×1:
[[2]]
y1y2
1 6 4
2 9 2
3 2 3
The same datasheets in my files are split up correctly into multiple columns. The dataframes in R have their columns separated before using the code in this article to lapply them into a list. Any idea as to why this is happening? I changed variable names but otherwise what I have is identical to what this article says should work
Hello,
Could you please share the code that you are using?
Best,
Cansu