Add Significance Level & Stars to Plot in R (Example) | ggsignif Package
Here’s how to do it:
Example 1: Draw Boxplot with Significance Stars
The following R code illustrates how to create a box-and-whisker plot with significance levels in the R programming language.
For this, we first have to create an example data set:
set.seed(567445)                                    # Create example data for boxplot
data_box <- data.frame(group = rep(LETTERS[1:4], each = 100),
                       value = c(rnorm(100),
                                 rnorm(100, 3),
                                 rnorm(100),
                                 rnorm(100, - 5)))
head(data_box)                                      # Print head of example data
By running the previous R programming code, we have managed to create Table 1, i.e. a data frame containing 400 rows and the two columns called group and value.
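Because head() prints only the first six rows, it can be useful to confirm the overall structure of the data as well. The following is a minimal sketch in base R that rebuilds the example data with the same seed and checks its dimensions and group sizes. The object name group_sizes is only an illustrative choice.

```r
set.seed(567445)                                    # Same seed as in the example data
data_box <- data.frame(group = rep(LETTERS[1:4], each = 100),
                       value = c(rnorm(100),
                                 rnorm(100, 3),
                                 rnorm(100),
                                 rnorm(100, - 5)))

dim(data_box)                                       # Number of rows and columns
group_sizes <- table(data_box$group)                # Observations per group (illustrative name)
group_sizes
aggregate(value ~ group, data = data_box, FUN = mean)  # Mean of value by group
```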
In this tutorial, we will draw our data using the ggplot2 package. In order to use the functions of the ggplot2 package, we first need to install and load ggplot2.
install.packages("ggplot2")                         # Install ggplot2 package
library("ggplot2")                                  # Load ggplot2 package
In the next step, we can draw a boxplot without significance levels using the code below:
ggp_box <- ggplot(data_box,                         # Create ggplot2 boxplot
                  aes(x = group,
                      y = value)) +
  geom_boxplot()
ggp_box                                             # Draw ggplot2 boxplot
In Figure 1 you can see that we have plotted a boxplot showing the four groups in our example data in separate boxes.
Let’s assume that we want to test whether the different boxplots (i.e. the different groups in our data) are significantly different. Furthermore, let’s assume that we would like to add the significance levels to our graphic.
Then, we first have to install and load the ggsignif package.
install.packages("ggsignif")                        # Install ggsignif package
library("ggsignif")                                 # Load ggsignif package
With the package loaded, we can apply the geom_signif function (or alternatively the stat_signif function) as shown below. Within the geom_signif function, we have to use the comparisons argument to specify the groups that we want to compare.
Let’s do this:
ggp_box +                                           # Add p-value to plot
  geom_signif(comparisons = list(c("A", "B")))
By executing the previous syntax, we have managed to create Figure 2, i.e. a ggplot2 boxplot with a significance level that compares the groups A and B.
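To see where such a p-value comes from, the comparison can be reproduced outside of the plot. According to the ggsignif documentation, geom_signif uses wilcox.test (the Wilcoxon rank-sum test) unless the test argument says otherwise. The following sketch assumes this default and runs the test by hand on the groups A and B. The object names data_AB and test_AB are illustrative.

```r
set.seed(567445)                                    # Recreate the example data
data_box <- data.frame(group = rep(LETTERS[1:4], each = 100),
                       value = c(rnorm(100),
                                 rnorm(100, 3),
                                 rnorm(100),
                                 rnorm(100, - 5)))

data_AB <- subset(data_box, group %in% c("A", "B"))  # Keep only groups A and B
test_AB <- wilcox.test(value ~ group, data = data_AB)  # Wilcoxon rank-sum test (assumed default)
test_AB$p.value                                     # p-value of the comparison
```

A different test can be requested through the test argument of geom_signif, for example test = "t.test".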
In the previous plot, the comparison of our groups is annotated with a p-value. However, we may show significance stars instead by setting the map_signif_level argument to TRUE:
ggp_box +                                           # Add significance stars to plot
  geom_signif(comparisons = list(c("A", "B")),
              map_signif_level = TRUE)
Figure 3 shows the output of the previous syntax – i.e. our boxplot with significance stars.
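The translation from a p-value to stars can be sketched in a few lines of base R. The function p_to_stars below is a hypothetical helper and not part of ggsignif. It assumes the cutoffs described in the package documentation (*** below 0.001, ** below 0.01, * below 0.05) and the label "NS." for everything else.

```r
p_to_stars <- function(p,                           # Hypothetical helper, not a ggsignif function
                       cutoffs = c("***" = 0.001,
                                   "**" = 0.01,
                                   "*" = 0.05)) {
  hits <- cutoffs[cutoffs > p]                      # Cutoffs that the p-value falls below
  if(length(hits) == 0) {
    "NS."                                           # Not significant at any cutoff
  } else {
    names(hits)[which.min(hits)]                    # Label of the strictest cutoff passed
  }
}

p_to_stars(0.0005)                                  # Very small p-value
p_to_stars(0.03)                                    # Moderately small p-value
p_to_stars(0.3)                                     # Large p-value
```

Instead of TRUE, the map_signif_level argument also accepts a named numeric vector such as the cutoffs object above, in case other thresholds or labels are preferred.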
It is also possible to compare multiple groups at the same time. For this, we have to extend our comparison list, and to avoid a visual overlap we should also specify the y-axis positions where we want to show the significance stars:
ggp_box +                                           # Comparison of multiple boxplots
  geom_signif(comparisons = list(c("A", "B"),
                                 c("A", "C")),
              map_signif_level = TRUE,
              y_position = c(7.5, 9))
In Figure 4 you can see that we have created another version of our boxplot where group A is compared simultaneously with the groups B and C.
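The y-axis positions do not have to be typed in by hand. As a rough sketch, they can be derived from the data by taking the largest value among the compared groups and stacking the brackets above it. The offsets 1 and 2.5 and the object names compared and y_pos are arbitrary choices for illustration.

```r
set.seed(567445)                                    # Recreate the example data
data_box <- data.frame(group = rep(LETTERS[1:4], each = 100),
                       value = c(rnorm(100),
                                 rnorm(100, 3),
                                 rnorm(100),
                                 rnorm(100, - 5)))

compared <- data_box$value[data_box$group %in% c("A", "B", "C")]  # Values of compared groups
y_pos <- max(compared) + c(1, 2.5)                  # One position per comparison (arbitrary offsets)
y_pos
```

The vector y_pos could then be passed to geom_signif as y_position = y_pos in place of the hard-coded values.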
The geom_signif function also enables the user to modify the design of the significance levels. For instance, we can change the color, line size, and text size as shown below:
ggp_box +                                           # Change design of significance levels
  geom_signif(comparisons = list(c("A", "B"),
                                 c("A", "C")),
              map_signif_level = TRUE,
              y_position = c(7.5, 9.5),
              col = 2,
              size = 2,
              textsize = 5) +
  ylim(- 8, 12)
As you can see in Figure 5, we have increased the size of the significance levels, and we have changed the color to red.
Example 2: Draw Barplot with Significance Stars
In Example 1, I have shown how to add significance levels to a boxplot. However, it’s also possible to add significance levels to other types of graphs.
Example 2 demonstrates how to use the ggsignif package to add significance comparisons to a barplot.
First, we have to create another example data set:
data_bar <- data.frame(group = rep(LETTERS[1:3],    # Create example data for barplot
                                   each = 2),
                       subgroup = letters[1:2],
                       value = c(1, 1.1, 2, 1.2, 3.7, 4.2))
data_bar                                            # Print example data
The output of the previous R syntax is visualized in Table 2: A data frame containing six rows and the variables group, subgroup, and value.
We can draw a grouped barchart of these data using the R syntax below:
ggp_bar <- ggplot(data_bar,                         # Create ggplot2 barplot
                  aes(x = group,
                      y = value)) +
  geom_bar(stat = "identity",
           position = "dodge",
           aes(fill = subgroup))
ggp_bar                                             # Draw ggplot2 barplot
As revealed in Figure 6, the previous R syntax has created a grouped ggplot2 barchart.
Next, we can use the geom_signif function to annotate comparisons of the main groups to this graph:
ggp_bar +                                           # Add significance stars to barplot
  geom_signif(comparisons = list(c("A", "B")),
              map_signif_level = TRUE)
As shown in Figure 7, the previous R programming syntax has plotted a comparison between the main groups A and B on top of the previously created grouped barchart.
As you can see, this significance level indicates that the two groups are not significantly different. However, this non-significance is not surprising given how our data are formatted: each group contains only two values, which is too little for a test to detect a difference.
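The effect of the small sample can be checked directly. The sketch below assumes that the comparison is a Wilcoxon rank-sum test on the two values of group A and the two values of group B, and runs this test by hand. The object names value_A, value_B, and test_small are illustrative.

```r
data_bar <- data.frame(group = rep(LETTERS[1:3],    # Recreate example data for barplot
                                   each = 2),
                       subgroup = letters[1:2],
                       value = c(1, 1.1, 2, 1.2, 3.7, 4.2))

value_A <- data_bar$value[data_bar$group == "A"]    # Two values of group A
value_B <- data_bar$value[data_bar$group == "B"]    # Two values of group B
test_small <- wilcox.test(value_A, value_B)         # Exact test, since there are no ties
test_small$p.value                                  # Equal to 1/3
```

Every value of group B is larger than every value of group A, which is the most extreme ordering possible. Even so, with two observations per group the two-sided p-value of this test cannot fall below 1/3, so no stars can appear.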
Let’s assume that we know from a previously conducted data analysis that these two groups are, in fact, significantly different. Then, we might use the annotations argument within the geom_signif function to overwrite the default significance level:
ggp_bar +                                           # Change significance levels manually
  geom_signif(comparisons = list(c("A", "B")),
              map_signif_level = TRUE,
              annotations = c("***"))
The previous R code has created the barplot shown in Figure 8, where the comparison of A and B is labeled with the significance stars that we have specified manually.
So far, we have only compared the main groups in our barchart. However, it’s also possible to add significance comparisons for the subgroups.
To achieve this, we can apply the geom_signif function twice – once for the main groups, and once for the subgroups.
Consider the R code below:
ggp_bar +                                           # Add significance levels for grouped barplot
  geom_signif(comparisons = list(c("A", "B")),
              map_signif_level = TRUE,
              annotations = c("***")) +
  geom_signif(stat = "identity",
              data = data.frame(x = c(0.7, 1.7, 2.7),
                                xend = c(1.3, 2.3, 3.3),
                                y = c(1.5, 2.5, 4.5),
                                annotation = c("NS.", "***", "*")),
              aes(x = x,
                  xend = xend,
                  y = y,
                  yend = y,
                  annotation = annotation))
The previous R code has created the barplot shown in Figure 9. We have drawn a comparison for the main groups A and B, as well as a comparison of the two subgroups within each main group.
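The x-coordinates of the subgroup brackets follow a simple pattern: ggplot2 places the categories of a discrete x-axis at the positions 1, 2, 3, and so on, and each bracket spans 0.3 units to the left and right of its group. The following sketch builds the same annotation data programmatically. The names n_groups, half_width, and annotation_df are illustrative, and the labels are simply copied from the code above.

```r
n_groups <- 3                                       # Main groups A, B, and C
half_width <- 0.3                                   # Half-width of each bracket (illustrative name)

annotation_df <- data.frame(x = seq_len(n_groups) - half_width,     # Left end of brackets
                            xend = seq_len(n_groups) + half_width,  # Right end of brackets
                            y = c(1.5, 2.5, 4.5),                   # Height of brackets
                            annotation = c("NS.", "***", "*"))      # Labels copied from above
annotation_df                                       # Print annotation data
```

This data frame could be passed to the data argument of the second geom_signif call instead of the inline data.frame.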
Looks great, if you ask me! 🙂
Video, Further Resources & Summary
If you are interested in data visualization in R and the functions of the ggplot2 package, you have to watch the following video, where I explain the ggplot2 package in much more detail (beginners & advanced users).
In addition, you could have a look at some of the related tutorials which I have published on my homepage.
- Add Number of Observations by Group to ggplot2 Boxplot
- Draw Boxplot with Means in R
- Draw Multiple Boxplots in One Graph Side-by-Side
- Boxplots in R
- Draw Grouped Barplot in R
- Draw Stacked Barplot in R
- Barplots in R
- R Programming Overview
To summarize: In this R post you have learned how to add stars to a ggplot2 graphic to indicate the significance level. In case you have further questions, please let me know in the comments section.