R Error: `n()` must only be used inside dplyr verbs. (2 Examples)
In this article, I’ll illustrate how to reproduce and fix the error message “must only be used inside dplyr verbs” in the R programming language.
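The post is structured as follows:
1) Creation of Example Data
2) Example 1: Reproduce the Error: `n()` must only be used inside dplyr verbs.
3) Example 2: Fix the Error: `n()` must only be used inside dplyr verbs.
4) Video, Further Resources & Summary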
Let’s start right away…
Creation of Example Data
The following data is used as the basis for this R programming language tutorial:
data <- data.frame(value = 1:9,            # Create example data
                   group = letters[1:3])
data                                       # Print example data
Running the previous R code prints the data frame to the console. Our example data consists of nine rows and two columns, with the group labels a, b, and c recycled across the rows.
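As a quick check of the structure described above, the following sketch rebuilds the example data and inspects its dimensions and group sizes. It uses base R functions only, so it does not depend on dplyr or plyr.

```r
# Recreate the example data from above
data <- data.frame(value = 1:9,
                   group = letters[1:3])   # letters[1:3] is recycled to length 9

dim(data)                                  # Number of rows and columns
table(data$group)                          # Number of rows per group
```

Each of the three groups should contain the same number of rows, which is what we will try to count with dplyr in the examples.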
Example 1: Reproduce the Error: `n()` must only be used inside dplyr verbs.
In Example 1, I’ll show how to replicate the error “must only be used inside dplyr verbs” in the R programming language.
In order to reproduce this error message, we first have to install and load the dplyr package:
install.packages("dplyr")                  # Install dplyr package
library("dplyr")                           # Load dplyr package
As a next step, we have to install and load the plyr package:
install.packages("plyr")                   # Install plyr package
library("plyr")                            # Load plyr package
Note that the order in which we load those two packages is key for the replication of the error message “must only be used inside dplyr verbs”.
Now, let’s assume that we want to count the number of cases within each group of our example data frame using the group_by, summarize, and n functions.
Then, we might try to execute the following R code:
data %>%                                   # Code leads to error
  group_by(group) %>%
  summarize(count = n())
# Error: `n()` must only be used inside dplyr verbs.
# Run `rlang::last_error()` to see where the error occurred.
Unfortunately, the RStudio console returns the error message “must only be used inside dplyr verbs”.
The reason for this is that a function called summarize exists in both the dplyr and the plyr package.
Since we have loaded the plyr package second, its summarize function masks the one from dplyr, so an unqualified call to summarize is dispatched to plyr.
However, n() only works inside dplyr verbs such as the summarize function of dplyr. The plyr version evaluates n() outside of that context, which triggers the error.
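To see this masking directly, one option is to ask R where it finds summarize on the search path. The sketch below assumes that dplyr and plyr are both installed and attaches them in the same order as in this example. The find function from the utils package lists the matching search path entries in lookup order.

```r
# Attach the packages in the same order as in Example 1
suppressPackageStartupMessages({
  library("dplyr")
  library("plyr")                          # Attached last, so it is searched first
})

# List all attached packages that provide an object called summarize
hits <- find("summarize")
hits                                       # First entry is used for unqualified calls
```

If plyr appears before dplyr in this result, an unqualified summarize call will use the plyr version.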
So how can we debug this error message? Keep on reading!
Example 2: Fix the Error: `n()` must only be used inside dplyr verbs.
This example illustrates how to avoid the error message “must only be used inside dplyr verbs”.
As explained before, the cause of this error is a name conflict: the unqualified summarize call resolves to the plyr function instead of the dplyr function.
Fortunately, we can tell R explicitly which package to use by putting the package name and :: in front of the function name.
Consider the following R code:
data %>%                                   # Code does not lead to error
  group_by(group) %>%
  dplyr::summarize(count = n())
# `summarise()` ungrouping output (override with `.groups` argument)
# # A tibble: 3 x 2
#   group count
#   <chr> <int>
# 1 a         3
# 2 b         3
# 3 c         3
The previous code works fine and runs without any error messages. The `summarise()` line at the top of the output is only an informational message, not an error.
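Since the order in which the two packages are loaded is key, another possible way around the conflict is to attach plyr first and dplyr second. The following is a minimal sketch of that idea, assuming a fresh R session in which neither package has been attached yet. The object name result is only used for illustration.

```r
# Attach plyr first and dplyr second (the reverse of Example 1)
suppressPackageStartupMessages({
  library("plyr")
  library("dplyr")                         # Attached last, so it is searched first
})

data <- data.frame(value = 1:9,            # Recreate example data
                   group = letters[1:3])

result <- data %>%                         # Unqualified summarize is dplyr's here
  group_by(group) %>%
  summarize(count = n())
result                                     # Print group counts
```

This variant only helps as long as the load order stays under your control. Writing dplyr::summarize explicitly, as shown in Example 2, does not depend on the load order.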
Video, Further Resources & Summary
Do you want to learn more about dplyr errors? Then you may watch the following video tutorial on my YouTube channel. In the video, I explain the content of this tutorial.
In addition, you could have a look at the related R programming tutorials on my homepage.
You have learned in this article how to handle the error message “must only be used inside dplyr verbs” in the R programming language.
However, I have some final notes before we close this tutorial:
In this tutorial, we have illustrated the error message “must only be used inside dplyr verbs” based on the summarize and n functions. However, this error might also occur when using other dplyr functions that are in conflict with other packages.
Furthermore, this error message depends on the version of dplyr and the exact R code that you use. Depending on your situation, you may also see the error message “function should not be called directly”. However, the reason for this error message is also the conflict with other packages.
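To get an idea of which other functions might be affected by such conflicts, you could compare the exported function names of the two packages. The following sketch assumes that dplyr and plyr are installed. It does not attach either package, and the object name shared is only used for illustration.

```r
# Function names that are exported by both dplyr and plyr
shared <- intersect(getNamespaceExports("dplyr"),
                    getNamespaceExports("plyr"))
sort(shared)                               # Candidates for name conflicts
```

Any name in this list resolves to whichever package was attached last, unless you qualify the call with the package name and ::.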
Let me know in the comments if you have any additional questions. Furthermore, don’t forget to subscribe to my email newsletter in order to receive updates on the newest articles.
2 Comments
Great solution. Thanks a lot to socialize knowledge!
Hey Fabio,
Thanks a lot for the kind comment, glad you found it helpful!
Regards,
Joachim