Numbering Rows within Groups of Data Frame in R (2 Examples)
This article illustrates how to add a counter to each group of a data frame in the R programming language.
Table of contents:
- Constructing Example Data
- Example 1: Numbering Rows of Data Frame Groups with Base R
- Example 2: Numbering Rows of Data Frame Groups with dplyr Package
- Video & Further Resources
Let’s start right away…
Constructing Example Data
In the examples of this R tutorial, we’ll use the following example data frame:
data <- data.frame(x = 1:10,                      # Create example data
                   group = c(rep("g1", 3),
                             rep("g2", 5),
                             rep("g3", 2)))
data                                              # Print example data
#     x group
# 1   1    g1
# 2   2    g1
# 3   3    g1
# 4   4    g2
# 5   5    g2
# 6   6    g2
# 7   7    g2
# 8   8    g2
# 9   9    g3
# 10 10    g3
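As you can see based on the output of the RStudio console, our example data contains ten rows and two columns. The first column is an integer variable and the second column contains the grouping variable. Note that the group column is stored as a factor only in R versions before 4.0.0; since R 4.0.0, data.frame() keeps strings as character vectors by default. Both examples below work with either class.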
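To check how the columns of the example data are stored, the classes can be inspected directly. The following is a minimal sketch; converting the grouping column with as.factor is optional and is shown only in case a factor is wanted regardless of the R version in use.

```r
# Recreate the example data from above
data <- data.frame(x = 1:10,
                   group = c(rep("g1", 3), rep("g2", 5), rep("g3", 2)))

sapply(data, class)                 # Class of each column (group depends on R version)

data$group <- as.factor(data$group) # Optional: explicitly store group as factor
levels(data$group)                  # Show the group labels
```

After the conversion, the group column is a factor with the levels g1, g2, and g3 in every R version.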
Example 1: Numbering Rows of Data Frame Groups with Base R
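If we want to create a column containing a counter for each group of our data frame, we can use the ave function as shown in the following R code. The ave function splits its first argument by the grouping variable and applies FUN to each piece. Because seq_along only depends on the length of each piece, the values of x themselves do not matter here: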
data1 <- data                                     # Replicate example data
data1$numbering <- ave(data1$x,                   # Create numbering variable
                       data1$group,
                       FUN = seq_along)
data1                                             # Print updated data
#     x group numbering
# 1   1    g1         1
# 2   2    g1         2
# 3   3    g1         3
# 4   4    g2         1
# 5   5    g2         2
# 6   6    g2         3
# 7   7    g2         4
# 8   8    g2         5
# 9   9    g3         1
# 10 10    g3         2
The updated data frame consists of the same columns as our example data plus a variable numbering each group.
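The example data happens to be sorted by group, but the ave approach does not rely on that. The following sketch uses a small hypothetical data frame (data_unsorted is not part of the original example) in which the groups are interleaved, to show that the counter still runs within each group in order of appearance.

```r
# Hypothetical data with interleaved groups (illustration only)
data_unsorted <- data.frame(x = 1:6,
                            group = c("g1", "g2", "g1", "g3", "g2", "g1"))

# Number the rows within each group; the row order is left unchanged
data_unsorted$numbering <- ave(seq_along(data_unsorted$group),
                               data_unsorted$group,
                               FUN = seq_along)

data_unsorted$numbering
# [1] 1 1 2 1 2 3
```

Passing seq_along(data_unsorted$group) as the first argument makes explicit that only the position of each row is used, not the values of x.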
Example 2: Numbering Rows of Data Frame Groups with dplyr Package
Example 2 shows how to add a numbering sequence variable with the dplyr package in R. First, we need to install and load the package:
install.packages("dplyr")                         # Install and load dplyr
library("dplyr")
Now, we can use the group_by and the mutate functions to create a counter within each group:
data2 <- data                                     # Replicate example data
data2 <- data2 %>%                                # Create numbering variable
  group_by(group) %>%
  mutate(numbering = row_number())
data2                                             # Print updated data
# # A tibble: 10 x 3
# # Groups:   group [3]
#        x group numbering
#    <int> <fct>     <int>
#  1     1 g1            1
#  2     2 g1            2
#  3     3 g1            3
#  4     4 g2            1
#  5     5 g2            2
#  6     6 g2            3
#  7     7 g2            4
#  8     8 g2            5
#  9     9 g3            1
# 10    10 g3            2
The values are the same as in Example 1, but the data is now stored as a grouped tibble instead of a plain data frame.
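If a plain data frame is preferred over a grouped tibble, the grouping can be dropped again after the counter has been created. The sketch below assumes that dplyr is installed and calls its functions with the dplyr:: prefix, so it also works when another loaded package masks functions such as mutate. The object names grouped, numbered, and data3 are chosen for illustration only.

```r
# Recreate the example data from above
data <- data.frame(x = 1:10,
                   group = c(rep("g1", 3), rep("g2", 5), rep("g3", 2)))

grouped  <- dplyr::group_by(data, group)           # Group by the grouping column
numbered <- dplyr::mutate(grouped,                 # Add counter within each group
                          numbering = dplyr::row_number())
data3    <- as.data.frame(dplyr::ungroup(numbered)) # Drop grouping and tibble class

class(data3)
# [1] "data.frame"
```

Ungrouping before further processing avoids later operations being silently applied per group.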
Video & Further Resources
Have a look at the video on my YouTube channel, in which I illustrate the content of this article.
In addition, you may want to have a look at the related tutorials on this website.
To summarize: In this tutorial, I showed how to generate a sequential counter ID within each group of a data frame in R programming. In case you have additional questions, don’t hesitate to let me know in the comments below.
2 Comments
Joachim, Thank you for this.
When I tried the example 2 code on my data, however, I got the error
Error: `n()` must only be used inside dplyr verbs.
Putting ‘dplyr::’ in front of the function calls solved this to make sure they came from the right package (okay, bit of a scatter-gun solution but it did seem to work…).
Hey Ashley,
Thank you very much for the hint! This is a common problem when loading multiple packages containing functions with the same function names.
Have a look here for more details on your specific error. Putting dplyr in front of the function names is probably the easiest fix for this.
Regards,
Joachim