Remove Duplicated Rows from Data Frame in R (Example)

 

This tutorial illustrates how to eliminate duplicated rows from a data frame in R programming.

Let’s dig in…

 

Creation of Example Data

In the example of this R tutorial, we’ll use the following data frame in R:

data <- data.frame(x1 = c(1:5, 2, 5),                # Create example data
                   x2 = c(letters[1:5], "b", "e"))
data                                                 # Print example data
# x1 x2
#  1  a
#  2  b
#  3  c
#  4  d
#  5  e
#  2  b
#  5  e

The output printed to the RStudio console shows the structure of our data. Our data frame consists of seven rows and two columns, whereby rows 2 and 5 are duplicated in rows 6 and 7.
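
If you want to verify this structure yourself, the following lines are a minimal sketch using the base R functions dim() and anyDuplicated(). The object name first_dup is only chosen for illustration and is not part of the original example:

```r
data <- data.frame(x1 = c(1:5, 2, 5),                # Re-create example data
                   x2 = c(letters[1:5], "b", "e"))
dim(data)                                            # Number of rows and columns
# [1] 7 2
first_dup <- anyDuplicated(data)                     # Index of first repeated row (0 if none)
first_dup                                            # Print index (hypothetical helper object)
# [1] 6
```

A value of 0 would indicate that the data frame does not contain any repeated rows.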

 

Example: Delete Duplicated Rows from Data Frame

If we want to remove repeated rows from our example data, we can use the duplicated() R function. The duplicated function returns a logical vector with one element per row, which is TRUE for each row that repeats an earlier row and FALSE otherwise. By putting a bang (i.e. !) in front of the duplicated command, we negate this vector and subset our data so that only unique rows remain:

data_unique <- data[!duplicated(data), ]             # Remove duplicated rows
data_unique                                          # Print unique data
# x1 x2
#  1  a
#  2  b
#  3  c
#  4  d
#  5  e

As you can see based on the previous output of the RStudio console, only the five unique rows remain in our data. Rows 6 and 7 were removed.
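
To make the logic of this example more transparent, the following sketch prints the logical vector returned by duplicated() and compares the result with two base R alternatives: the unique() function and the fromLast argument of duplicated(). The object names data_unique2 and data_last are assumptions made for illustration only:

```r
data <- data.frame(x1 = c(1:5, 2, 5),                # Re-create example data
                   x2 = c(letters[1:5], "b", "e"))
duplicated(data)                                     # Logical vector, one value per row
# [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
data_unique2 <- unique(data)                         # Alternative: unique() on data frame
identical(data_unique2, data[!duplicated(data), ])   # Compare with subsetting approach
# [1] TRUE
data_last <- data[!duplicated(data, fromLast = TRUE), ]  # Keep last occurrence instead
rownames(data_last)                                  # Row names of remaining rows
# [1] "1" "3" "4" "6" "7"
```

Both variants return five rows. They differ only in which copy of a repeated row is kept, which may matter in case the order or the row names of your data are relevant.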

 

Video & Further Resources

Would you like to know more about the removal of duplicated rows from a data frame? Then you might watch the following video on my YouTube channel. I show the R programming code of this tutorial in the video:

 


 

In addition, you might read the other posts on my website.

 

Summary: At this point of the tutorial you should have learned how to identify and remove duplicates in the R programming language. Let me know in the comments section below, in case you have any further questions.

 


