Ignore Outliers in ggplot2 Boxplot in R (Example)

 

In this article you’ll learn how to remove outliers from ggplot2 boxplots in the R programming language.

The article contains one example for the removal of outliers. It first introduces the example data, then shows how to draw the boxplot without outliers, and closes with a video and further resources.

Let’s do this:

 

Introduction of Example Data

In this example, we’ll use the following data frame as a basis:

data <- data.frame(y = c(runif(20), 5, -3, 8))            # Create example data

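Our data frame consists of one numeric variable. Twenty of its values are drawn from a uniform distribution between 0 and 1; the remaining three values (5, -3, and 8) lie far outside this range and are outliers.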
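To check which values a boxplot would flag, the following minimal sketch uses the base R function boxplot.stats(), which applies the common 1.5 * IQR rule. The set.seed() call and the object name out_vals are assumptions added for illustration and are not part of the original example; the quartiles used by ggplot2 may also differ marginally from the hinges used here.

```r
set.seed(123)                                             # Assumption: seed added for reproducibility
data <- data.frame(y = c(runif(20), 5, -3, 8))            # Same structure as the example data

out_vals <- boxplot.stats(data$y)$out                     # Values beyond 1.5 * IQR from the box
out_vals                                                  # Print flagged values
```

The three extreme values should always appear in out_vals; depending on the random draw, a few of the uniform values might be flagged as well.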

In order to draw plots with the ggplot2 package, we first need to install and load the package:

install.packages("ggplot2")                               # Install and load ggplot2
library("ggplot2")

Now, we can draw a basic ggplot2 boxplot with the ggplot() and geom_boxplot() functions:

ggplot(data, aes(y = y)) +                                # Create ggplot with outliers
  geom_boxplot()

 

Figure 1: ggplot2 Boxplot with Outliers.

 

As Figure 1 shows, we created a ggplot2 boxplot with outliers. Now, let’s remove these outliers…

 

Example: Remove Outliers from ggplot2 Boxplot

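To hide outliers in a ggplot2 boxplot, we set the outlier.shape argument of geom_boxplot() to NA. This only suppresses the outlier points; the axis still spans the full data range. We therefore also use coord_cartesian() to zoom the y-axis to the range between two quantiles (here the 10% and 90% quantiles), so that values outside this range are no longer visible. Because coord_cartesian() zooms without dropping data, the box and whiskers are still computed from all values. Have a look at the following R programming code and the output in Figure 2: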

ggplot(data, aes(y = y)) +                                # Create ggplot without outliers
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim = quantile(data$y, c(0.1, 0.9)))

 

Figure 2: ggplot2 Boxplot without Outliers.

 

As you can see, the outliers are no longer shown in our plot. Note that the range of the y-axis is much narrower than in Figure 1, since the axis no longer has to cover the outliers. You can set the y-axis limits to your own preferences by changing the ylim argument of coord_cartesian().
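As one possible way to choose the limits, the sketch below zooms to the whisker ends instead of fixed quantiles, so that the whole box and both whiskers stay visible. It is an illustration under assumptions: the set.seed() call and the object names whiskers and p are not part of the original example, and the whisker ends are taken from base R's boxplot.stats(), which may differ marginally from the values ggplot2 computes.

```r
library("ggplot2")

set.seed(123)                                             # Assumption: seed added for reproducibility
data <- data.frame(y = c(runif(20), 5, -3, 8))            # Same structure as the example data

whiskers <- boxplot.stats(data$y)$stats[c(1, 5)]          # Lower and upper whisker ends

p <- ggplot(data, aes(y = y)) +                           # Boxplot zoomed to the whisker range
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim = whiskers)
p
```

Any other numeric vector of length two, for example ylim = c(-1, 2), can be passed in the same way.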

 

Video & Further Resources

I have recently released a video on my YouTube channel, which illustrates the example of this article.

 

 

Furthermore, I recommend having a look at the other articles on my homepage, including the posts about ggplot2 and the axis limits of plots.

 

To summarize: At this point you should know how to ignore and hide outliers in ggplot2 boxplots in the R programming language. Don’t hesitate to let me know in the comments section below, in case you have further questions.

 


