R Error in read.table: duplicate ‘row.names’ are not allowed (3 Examples)

 

This article explains how to deal with the error message “duplicate ‘row.names’ are not allowed” in R programming.

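The article consists of three examples: reproducing the error, fixing it by modifying the CSV file, and fixing it directly in R with the row.names argument.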

Let’s dive right in!

 

Example 1: Reproducing the Error: duplicate ‘row.names’ are not allowed

In Example 1, I’ll show how to replicate the R programming error message in read.table “duplicate ‘row.names’ are not allowed”.

Let’s assume that we want to read the following CSV file into R:

 

[Figure: example CSV file with the header x1,x2,x3 and four data rows, each ending with a trailing comma]

 

Then, we might try to use the read.csv function as shown below:

data <- read.csv("C:/Users/Joach/Desktop/my directory/my_data.csv")           # Try to read file
# Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
#   duplicate 'row.names' are not allowed

Unfortunately, the read.csv function returns the error message “duplicate ‘row.names’ are not allowed” to the RStudio console.

The reason for this is that our CSV file contains a comma (i.e. ,) at the end of each data row, but the header line does not. Each data row therefore has one field more than the header.

When the header has exactly one field fewer than the data rows, R assumes that the first value of each row is a row name. Since some of these values are the same (the value 1 occurs three times), the error message is returned.
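The mismatch described above can be checked before importing. The following is a minimal sketch, assuming the file content shown in the examples of this tutorial; the temporary file and the object names csv_lines, tmp_file, n_fields and result are made up for illustration.

```r
# Recreate the example file in a temporary location (hypothetical file)
csv_lines <- c("x1,x2,x3",
               "1,5,2,",
               "1,2,7,",
               "2,5,7,",
               "1,3,1,")
tmp_file <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp_file)

# Count the comma-separated fields in each line of the file
n_fields <- count.fields(tmp_file, sep = ",")
n_fields                                   # Header has 3 fields, data rows have 4
# [1] 3 4 4 4 4

# Reading the file triggers the error; tryCatch stores the message instead
result <- tryCatch(read.csv(tmp_file),
                   error = function(e) conditionMessage(e))
result

unlink(tmp_file)                           # Remove the temporary file
```

If the first value returned by count.fields is exactly one smaller than the others, the file shows the pattern that leads to this error.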

In the following examples, I’ll explain different ways to solve this problem. So keep on reading!

 

Example 2: Fixing the Error by Modifying the CSV File

In this example, I’ll show how to fix the error message “duplicate ‘row.names’ are not allowed” by manually modifying the input CSV file.

If we want to avoid this error message, we can simply remove the comma at the end of each row. Our modified CSV file should look as follows:

 

[Figure: modified CSV file with the trailing comma removed from each data row]

 

Now, we can read this modified CSV file using the read.csv function:

data <- read.csv("C:/Users/Joach/Desktop/my directory/my_data_modified.csv")  # Read modified file
data                                                                          # Print data
#   x1 x2 x3
# 1  1  5  2
# 2  1  2  7
# 3  2  5  7
# 4  1  3  1

Looks good!
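If editing the file by hand is not practical, the same modification can be sketched in R by removing the trailing comma from each line before parsing. This is only an illustration under the assumption that a comma at the end of a line never represents a real empty last value; the temporary file and the object names are hypothetical.

```r
# Recreate the example file with trailing commas (hypothetical file)
csv_lines <- c("x1,x2,x3",
               "1,5,2,",
               "1,2,7,",
               "2,5,7,",
               "1,3,1,")
tmp_file <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp_file)

raw_lines <- readLines(tmp_file)             # Read file as plain text lines
clean_lines <- sub(",$", "", raw_lines)      # Drop one comma at the end of each line
data_clean <- read.csv(text = clean_lines)   # Parse the cleaned lines
data_clean                                   # Print data

unlink(tmp_file)                             # Remove the temporary file
```

Note that the header line is left unchanged by sub, because it does not end with a comma.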

 

Example 3: Fixing the Error Using row.names Argument

In the previous example, I’ve shown how to manually fix our CSV file so that we can load it using read.csv. However, if your file is large or if you prefer a reproducible workflow, you may want to solve this issue directly in R.

The following code shows how to import and properly manipulate our CSV file in R.

First, we can avoid the error message by setting the row.names argument to be equal to NULL:

data <- read.csv("C:/Users/Joach/Desktop/my directory/my_data.csv",           # Use row.names argument
                 row.names = NULL)
data                                                                          # Print data
#   row.names x1 x2 x3
# 1         1  5  2 NA
# 2         1  2  7 NA
# 3         2  5  7 NA
# 4         1  3  1 NA

As you can see, the previous R code created a data frame with a variable called row.names and a variable that contains only NA values. In other words: the column names are shifted one position to the right relative to the values.

To solve this problem, we first have to change the column names of our data frame…

colnames(data) <- colnames(data)[2:ncol(data)]                                # Modify column names
data                                                                          # Print data
#   x1 x2 x3 NA
# 1  1  5  2 NA
# 2  1  2  7 NA
# 3  2  5  7 NA
# 4  1  3  1 NA

…and then we have to remove the empty column that only consists of missing values:

data <- data[ , - ncol(data)]                                                 # Remove empty column
data                                                                          # Print data
#   x1 x2 x3
# 1  1  5  2
# 2  1  2  7
# 3  2  5  7
# 4  1  3  1

As you can see, the final data frame looks exactly the same as the data frame that we have imported in Example 2. However, if you are using this method, please make sure that your data is formatted properly at the end of the data manipulation process. It can easily happen that some variables are named differently than they should be.
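To reduce the risk of wrongly named variables, the steps of this example could be wrapped in a small helper that stops when the file does not show the expected pattern. The function name read_csv_trailing_sep is hypothetical, and the two checks are assumptions based on the output shown above (a first column called row.names and a last column containing only NA values).

```r
# Hypothetical helper: import a CSV file with trailing commas in the data rows
read_csv_trailing_sep <- function(path, ...) {
  df <- read.csv(path, row.names = NULL, ...)    # Avoid the row.names error
  last <- ncol(df)
  # Only repair the data if the shifted-header pattern is present
  if (names(df)[1] != "row.names" || !all(is.na(df[[last]]))) {
    stop("File does not show the shifted-header pattern")
  }
  fixed <- df[ , - last, drop = FALSE]           # Remove empty last column
  names(fixed) <- names(df)[-1]                  # Shift column names to the left
  fixed
}

# Usage with a temporary copy of the example file (hypothetical file)
tmp_file <- tempfile(fileext = ".csv")
writeLines(c("x1,x2,x3", "1,5,2,", "1,2,7,", "2,5,7,", "1,3,1,"), tmp_file)
data_fixed <- read_csv_trailing_sep(tmp_file)
data_fixed                                       # Print data
unlink(tmp_file)                                 # Remove the temporary file
```

Dropping the empty column first and assigning the names afterwards avoids the intermediate NA column name that appears in the step-by-step version.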

 

Video, Further Resources & Summary

Do you need more information on the content of this post? Then you might watch the following video on my YouTube channel. In the video, I show the R code of this tutorial in a live session.

 

The YouTube video will be added soon.

 


At this point you should have learned how to solve the error “duplicate ‘row.names’ are not allowed” in the R programming language. In case you have additional questions, don’t hesitate to tell me about it in the comments.

 
