Numbering Rows within Groups of Data Frame in R (2 Examples)

 

This article illustrates how to add a counter to each group of a data frame in the R programming language.

Let’s start right away…

 

Constructing Example Data

In the examples of this R tutorial, we’ll use the following example data frame:

data <- data.frame(x = 1:10,                    # Create example data
                   group = c(rep("g1", 3),
                             rep("g2", 5),
                             rep("g3", 2)))
data                                            # Print example data
#     x group
# 1   1    g1
# 2   2    g1
# 3   3    g1
# 4   4    g2
# 5   5    g2
# 6   6    g2
# 7   7    g2
# 8   8    g2
# 9   9    g3
# 10 10    g3

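As the output in the RStudio console shows, our example data contains ten rows and two columns. The first column is an integer variable, and the second column is the grouping variable. It is a factor in R versions before 4.0.0; in R 4.0.0 and later, data.frame() keeps it as a character vector unless stringsAsFactors = TRUE is specified.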
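To check the structure described above on your own machine, the following sketch rebuilds the example data and inspects it. It assumes you want the grouping column stored as a factor regardless of the R version, so it sets stringsAsFactors = TRUE explicitly; the object name group_sizes is only used for illustration.

```r
# Rebuild the example data with the grouping column explicitly stored as a factor
data <- data.frame(x = 1:10,
                   group = c(rep("g1", 3),
                             rep("g2", 5),
                             rep("g3", 2)),
                   stringsAsFactors = TRUE)

str(data)                                       # Show the type of each column

group_sizes <- table(data$group)                # Count the rows of each group
group_sizes                                     # Print group sizes
```

The group sizes tell you the largest counter value to expect within each group in the examples that follow.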

 

Example 1: Numbering Rows of Data Frame Groups with Base R

If we want to create a column containing a counter for each group of our data frame, we can use the ave function as shown in the following R code:

data1 <- data                                   # Replicate example data
data1$numbering <- ave(data1$x,                 # Create numbering variable
                       data1$group,
                       FUN = seq_along)
data1                                           # Print updated data
#     x group numbering
# 1   1    g1         1
# 2   2    g1         2
# 3   3    g1         3
# 4   4    g2         1
# 5   5    g2         2
# 6   6    g2         3
# 7   7    g2         4
# 8   8    g2         5
# 9   9    g3         1
# 10 10    g3         2

The updated data frame consists of the same columns as our example data plus a variable that numbers the rows within each group.
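One detail worth knowing: ave writes its result back into a copy of its first argument, so the counter takes on the type of that vector. The sketch below uses a small hypothetical data frame (df, with a character column and interleaved groups, neither of which is part of the example data) to show this, and passes a plain integer index instead so that the counter stays an integer.

```r
# Hypothetical data: character column, groups not stored in contiguous blocks
df <- data.frame(label = c("a", "b", "c", "d", "e"),
                 group = c("g1", "g2", "g1", "g2", "g1"))

# Using a character column as first argument returns the counter as character
counter_chr <- ave(df$label, df$group, FUN = seq_along)
class(counter_chr)

# Using an integer row index as first argument keeps the counter an integer
df$numbering <- ave(seq_len(nrow(df)), df$group, FUN = seq_along)
df
```

The rows keep their original order, and the counter increases with each further occurrence of a group, even when the groups are interleaved.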

 

Example 2: Numbering Rows of Data Frame Groups with dplyr Package

Example 2 shows how to add a numbering sequence variable with the dplyr package in R. First, we need to install and load the package:

install.packages("dplyr")                       # Install dplyr package
library("dplyr")                                # Load dplyr package

Now, we can use the group_by and the mutate functions to create a counter within each group:

data2 <- data                                   # Replicate example data
data2 <- data2 %>%                              # Create numbering variable
  group_by(group) %>%
  mutate(numbering = row_number())
data2                                           # Print updated data
# # A tibble: 10 x 3
# # Groups:   group [3]
#        x group numbering
#    <int> <fct>     <int>
#  1     1 g1            1
#  2     2 g1            2
#  3     3 g1            3
#  4     4 g2            1
#  5     5 g2            2
#  6     6 g2            3
#  7     7 g2            4
#  8     8 g2            5
#  9     9 g3            1
# 10    10 g3            2

The counter values are the same as in Example 1, but the result is stored as a grouped tibble instead of a plain data frame.
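If you prefer a plain data frame without the grouping attribute, the pipeline can be extended as in the following sketch. It assumes that dplyr is installed, and it writes each function with an explicit dplyr:: prefix, which can help when another loaded package provides functions with the same names. The object name data3 is only used for illustration.

```r
library("dplyr")                                # Load dplyr (provides %>%)

data <- data.frame(x = 1:10,                    # Recreate example data
                   group = c(rep("g1", 3),
                             rep("g2", 5),
                             rep("g3", 2)))

data3 <- data %>%                               # Create numbering variable
  dplyr::group_by(group) %>%
  dplyr::mutate(numbering = dplyr::row_number()) %>%
  dplyr::ungroup() %>%                          # Remove the grouping
  as.data.frame()                               # Convert tibble to data frame

class(data3)                                    # Check class of result
data3$numbering                                 # Print counter values
```

Removing the grouping is a design choice: it avoids later dplyr operations on the result being applied group by group without you noticing.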

 

Video & Further Resources


To summarize: In this tutorial, I showed how to generate a sequential counter ID within each group of a data frame in R programming. In case you have additional questions, don’t hesitate to let me know in the comments below.

 



2 Comments

  • Ashley Buckner
    March 21, 2022 9:51 am

    Joachim, Thank you for this.
    When I tried the example 2 code on my data, however, I got the error
    Error: `n()` must only be used inside dplyr verbs.
    Putting ‘dplyr::’ in front of the function calls solved this to make sure they came from the right package (okay, bit of a scatter-gun solution but it did seem to work…).

    • Hey Ashley,

      Thank you very much for the hint! This is a common problem when loading multiple packages containing functions with the same function names.

      Have a look here for more details on your specific error. Putting dplyr in front of the function names is probably the easiest fix for this.

      Regards,
      Joachim
