R Error in read.table: duplicate ‘row.names’ are not allowed (3 Examples)
This article explains how to deal with the error message “duplicate ‘row.names’ are not allowed” in R programming.
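The article is structured as follows:
- Example 1: Reproducing the Error: duplicate ‘row.names’ are not allowed
- Example 2: Fixing the Error by Modifying the CSV File
- Example 3: Fixing the Error Using row.names Argument
- Video, Further Resources & Summary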
Let’s dive right in!
Example 1: Reproducing the Error: duplicate ‘row.names’ are not allowed
In Example 1, I’ll show how to reproduce the error message “duplicate ‘row.names’ are not allowed”, which is raised by read.table.
Let’s assume that we want to read the following CSV file into R:
x1,x2,x3
1,5,2,
1,2,7,
2,5,7,
1,3,1,
Then, we might try to use the read.csv function as shown below:
data <- read.csv("C:/Users/Joach/Desktop/my directory/my_data.csv")    # Try to read file
# Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
#   duplicate 'row.names' are not allowed
Unfortunately, the read.csv function returns the error message “duplicate ‘row.names’ are not allowed” to the RStudio console.
The reason for this is that our CSV file contains a comma at the end of each data row, but the header does not. The header therefore has one field fewer than the data rows.
In this situation, read.table (which read.csv calls internally) uses the first column of the input as row names. Since some of these values are the same (i.e. 1), and row names must be unique, the error message is returned.
In the following examples, I’ll explain different ways to solve this problem. So keep on reading!
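The following is a minimal sketch for reproducing the problem without access to the original file. The file contents and the temporary path are assumptions reconstructed from the outputs shown in this article, and the object names (csv_lines, my_file, n_fields, err_msg) are only illustrative.

```r
# Assumed contents of my_data.csv: header without, data rows with a trailing comma
csv_lines <- c("x1,x2,x3", "1,5,2,", "1,2,7,", "2,5,7,", "1,3,1,")

# Write the lines to a temporary file (hypothetical stand-in for the original path)
my_file <- tempfile(fileext = ".csv")
writeLines(csv_lines, my_file)

# Count the comma-separated fields in each line of the file;
# a mismatch between the header and the data rows hints at the problem
n_fields <- count.fields(my_file, sep = ",")
n_fields

# Capture the error message instead of stopping the script
err_msg <- tryCatch(
  {
    read.csv(my_file)
    NA_character_                       # Only reached if no error occurs
  },
  error = function(e) conditionMessage(e)
)
err_msg
```

Checking the field counts before importing can be a quick way to spot trailing separators in larger files, where the problem is harder to see by eye.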
Example 2: Fixing the Error by Modifying the CSV File
In this example, I’ll show how to fix the error message “duplicate ‘row.names’ are not allowed” by manually modifying the input CSV file.
If we want to avoid this error message, we can simply remove the comma at the end of each row. Our modified CSV file should look as follows:
x1,x2,x3
1,5,2
1,2,7
2,5,7
1,3,1
Now, we can read this modified CSV file using the read.csv function:
data <- read.csv("C:/Users/Joach/Desktop/my directory/my_data_modified.csv")    # Read modified file
data                                                                            # Print data
#   x1 x2 x3
# 1  1  5  2
# 2  1  2  7
# 3  2  5  7
# 4  1  3  1
Looks good!
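If editing the file by hand is not practical, the same cleanup could be scripted. The sketch below is not part of the original tutorial: it assumes the file contents reconstructed from the outputs in this article and uses temporary files and illustrative object names (in_file, out_file, data_clean) in place of the real paths.

```r
# Assumed contents of the problematic file (trailing comma in each data row)
csv_lines <- c("x1,x2,x3", "1,5,2,", "1,2,7,", "2,5,7,", "1,3,1,")
in_file <- tempfile(fileext = ".csv")
writeLines(csv_lines, in_file)

# Read the raw text and strip one trailing comma (plus any trailing spaces) per line
raw_lines <- readLines(in_file)
clean_lines <- sub(",\\s*$", "", raw_lines)

# Write the cleaned lines to a new file and import it as usual
out_file <- tempfile(fileext = ".csv")
writeLines(clean_lines, out_file)
data_clean <- read.csv(out_file)
data_clean
```

Note that this pattern removes any trailing comma, so it is only appropriate when the last field of a row is never legitimately empty.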
Example 3: Fixing the Error Using row.names Argument
In the previous example, I’ve shown how to manually fix our CSV file so that we can load it using read.csv. However, depending on the size of your file and your personal preferences, you may want to solve this issue directly in R.
The following code shows how to import and properly manipulate our CSV file in R.
First, we can avoid the error message by setting the row.names argument to be equal to NULL:
data <- read.csv("C:/Users/Joach/Desktop/my directory/my_data.csv",    # Use row.names argument
                 row.names = NULL)
data                                                                   # Print data
#   row.names x1 x2 x3
# 1         1  5  2 NA
# 2         1  2  7 NA
# 3         2  5  7 NA
# 4         1  3  1 NA
As you can see, the previous R code created a data frame with a variable called row.names and a variable that contains only NA values. In other words: the column names are shifted one position to the right relative to the values.
To solve this problem, we first have to change the column names of our data frame…
colnames(data) <- colnames(data)[2:ncol(data)]    # Modify column names
data                                              # Print data
#   x1 x2 x3 NA
# 1  1  5  2 NA
# 2  1  2  7 NA
# 3  2  5  7 NA
# 4  1  3  1 NA
…and then we have to remove the empty column that only consists of missing values:
data <- data[ , - ncol(data)]    # Remove empty column
data                             # Print data
#   x1 x2 x3
# 1  1  5  2
# 2  1  2  7
# 3  2  5  7
# 4  1  3  1
As you can see, the final data frame looks exactly the same as the data frame that we have imported in Example 2. However, if you are using this method, please make sure that your data is formatted properly at the end of the data manipulation process. It can easily happen that some variables are named differently than they should be.
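One way to guard against misnamed variables is to check that the last column really is empty before dropping it. The following sketch combines the steps of this example with such a check. It is only an illustration: the file contents are reconstructed from the outputs above, and the temporary file and the helper names (my_file, last_col, new_names) are assumptions.

```r
# Assumed contents of my_data.csv, written to a temporary file
csv_lines <- c("x1,x2,x3", "1,5,2,", "1,2,7,", "2,5,7,", "1,3,1,")
my_file <- tempfile(fileext = ".csv")
writeLines(csv_lines, my_file)

# Import without row names; the column names end up shifted by one position
data <- read.csv(my_file, row.names = NULL)

# Stop early if the last column holds real values instead of only NA
last_col <- ncol(data)
stopifnot(all(is.na(data[[last_col]])))

# Keep the intended names, drop the empty column, then assign the names
new_names <- colnames(data)[-1]
data <- data[ , -last_col, drop = FALSE]
colnames(data) <- new_names
data
```

Using drop = FALSE keeps the result a data frame even if only a single column remains after removing the empty one.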
Video, Further Resources & Summary
Do you need further information on the content of this post? Then you might want to watch the following video on my YouTube channel. In the video, I show the R code of this tutorial in a live session.
Additionally, you may read the related tutorials on my website:
- Change Row Names of Data Frame or Matrix
- Don’t Display Data Frame Row Names
- Convert Values in Column into Row Names of Data Frame
- Merge Data Frames by Row Names in R
- Solving Error & Warning Messages in R (List with Examples)
- R Programming Tutorials
At this point, you should have learned how to solve the error “duplicate ‘row.names’ are not allowed” in the R programming language. In case you have additional questions, don’t hesitate to let me know in the comments.