Import & Merge Multiple csv Files in R (Example)

 

In this article, I’ll show you how to import and merge csv files in the R programming language.

The page will contain the following topics:

Let’s do this!

 

Exemplifying Data

Before we can start with the example, we need to create an exemplifying directory including multiple csv files. First, we need to create several data frames

data1 <- data.frame(id = 1:6,                                     # Create first example data frame
                    x1 = c(5, 1, 4, 9, 1, 2),
                    x2 = c("A", "Y", "G", "F", "G", "Y"))
 
data2 <- data.frame(id = 4:9,                                     # Create second example data frame
                    y1 = c(3, 3, 4, 1, 2, 9),
                    y2 = c("a", "x", "a", "x", "a", "x"))
 
data3 <- data.frame(id = 5:6,                                     # Create third example data frame
                    z1 = c(3, 2),
                    z2 = c("K", "b"))

…and then we need to export these data frames as csv files to our computer:

write.csv(data1, "C:/...Your Path.../my_folder/data1.csv")        # Write first example data frame
write.csv(data2, "C:/...Your Path.../my_folder/data2.csv")        # Write second example data frame
write.csv(data3, "C:/...Your Path.../my_folder/data3.csv")        # Write third example data frame

 

example directory with csv files

Figure 1: Exemplifying Directory with csv Files.

 

Figure 1 illustrates how our example directory looks like. Now let’s import and combine these data sets in RStudio…

 

Import & Load csv Files in R

We need three R add-on packages for the following R syntax: dplyr, plyr, and readr. Let’s install and load the packages to R.

install.packages("dplyr")                                         # Install dplyr package
install.packages("plyr")                                          # Install plyr package
install.packages("readr")                                         # Install readr package
 
library("dplyr")                                                  # Load dplyr package
library("plyr")                                                   # Load plyr package
library("readr")                                                  # Load readr package

Now, we can import and merge the example csv files as follows:

data_all <- list.files(path = "C:/...Your Path.../my_folder",     # Identify all csv files in folder
                       pattern = "*.csv", full.names = TRUE) %>% 
  lapply(read_csv) %>%                                            # Store all files in list
  bind_rows                                                       # Combine data sets into one data set 
data_all                                                          # Print data to RStudio console

 

tibble with merged data files r

Table 1: Tibble Containing Three Data Sets.

 

Table 1 shows the output of the previous R code. As you can see, our three data sets were combined in a single data set. Data set cells were set to NA, in case a variable was not included in all data sets.

Note that our previous R syntax created a tibble instead of a data frame. In case you prefer to work with data frames, you could simply convert this tibble to a data frame as follows:

as.data.frame(data_all)                                           # Convert tibble to data.frame

 

Video & Further Resources

Have a look at the following video that I have recently released on my Statistics Globe YouTube channel. I illustrate the contents of this article in more detail in the video.

 

The YouTube video will be added soon.

 

Please also note that the previous R code merged our data files by just column binding the different data sources to each other. However, it would also be possible to merge our data sets by a shared column name in order to avoid duplicated observations (i.e. our data sets contain an ID column). You could learn more about this type of merging HERE.

In addition, I can recommend to read the other articles on this homepage.

 

On this page I showed you how to combine all csv files in a folder in the R programming language. Let me know in the comments section, in case you have further questions.

 



Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Menu
Top