How to Draw a plotly Barplot in R (Example)

 

This article provides several examples of barplots in plotly.

 

 

Note: This article was created in collaboration with Kirby White. Kirby is an organizational effectiveness consultant and researcher who is currently pursuing a Ph.D. at Seattle Pacific University. You can read more about Kirby here!

 

Overview

Barplots are one of the most fundamental chart types. They are typically used to represent how a numerical property varies across categorical groups.

For instance, we could use a barplot to show how average math scores (the numeric value) differ between the USA, Canada, and Mexico (the categorical groups).

Typically, one bar is created for each category, and the length of each bar is proportional to its numeric value. Usually, taller bars represent larger values.

If you have not already done so, install the plotly package with install.packages("plotly") and load it with library(plotly). Some functions and the data used in this tutorial also come from the tidyverse, which you can install with install.packages("tidyverse") and load with library(tidyverse).
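
As a minimal setup sketch, the installation and loading steps can be combined so that packages are only installed when they are missing. The loop and the requireNamespace() check are an illustrative convenience, not a required part of the workflow:

```r
# Install plotly and the tidyverse only if they are not already available
for (pkg in c("plotly", "tidyverse")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}

# Load both packages for the current session
library(plotly)
library(tidyverse)
```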

 

Example Data

We’ll use the diamonds dataset for this example, which is pre-loaded in the ggplot2 package (a part of the tidyverse).

diamonds contains raw data values, but we often use barplots to display summary/aggregated values.

Let’s create three pre-processed data frames, which we’ll use for the rest of the tutorial:

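# Load the tidyverse for %>%, group_by(), summarize(), pivot_longer(), and count()
library(tidyverse)

# Average price, table, and depth for each cut
df <- 
  ggplot2::diamonds %>% 
  group_by(cut) %>% 
  summarize(price = mean(price),
            table = mean(table),
            depth = mean(depth))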

We’ll also create a “long” format of the same data, which can make it easier to plot some variations of the data:

df_long <- df %>% 
  pivot_longer(cols = 2:4)

Finally, we’ll create a simple table counting how many diamonds are in each combination of cut and clarity, which we’ll use later.

df_count <- count(ggplot2::diamonds, cut, clarity)
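
The later examples refer to columns called name and value, which are the default column names that pivot_longer() creates. As a hedged sketch, the same reshaping can be written with those defaults spelled out; df_long_explicit is a hypothetical name used only for this illustration:

```r
library(tidyverse)

# Rebuild the summary table so this example runs on its own
df <- ggplot2::diamonds %>%
  group_by(cut) %>%
  summarize(price = mean(price),
            table = mean(table),
            depth = mean(depth))

# Same reshaping as pivot_longer(cols = 2:4), with the defaults written out
df_long_explicit <- df %>%
  pivot_longer(
    cols = c(price, table, depth),
    names_to = "name",    # column holding the former column names
    values_to = "value"   # column holding the corresponding averages
  )

names(df_long_explicit)  # "cut" "name" "value"
nrow(df_long_explicit)   # 5 cuts x 3 measures = 15 rows
```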

 

Basic Barplot

Let’s create a simple barplot of the average price for each cut of diamond:

plot_ly(
  data = df,
  x = ~cut,
  y = ~price,
  type = "bar"
)


This makes it very easy to see that premium diamonds have the highest average price! That is surprising, given that “ideal” cuts are a higher grade.
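
If you want to make the chart easier to read, a title and axis labels can be added with layout(). This is only a sketch; the title and label texts are illustrative choices:

```r
library(plotly)
library(dplyr)

# Average price per cut, rebuilt so this example runs on its own
df <- ggplot2::diamonds %>%
  group_by(cut) %>%
  summarize(price = mean(price))

# Basic barplot with a chart title and axis titles
p <- plot_ly(data = df, x = ~cut, y = ~price, type = "bar") %>%
  layout(
    title = "Average diamond price by cut",
    xaxis = list(title = "Cut"),
    yaxis = list(title = "Average price (USD)")
  )

p
```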

 

Grouped Barplot

How can we compare each cut on multiple values? This is where grouped bar charts are very useful, as they display multiple numeric values for each group on the same chart.

We can do this by adding a trace (i.e., more data to the graph) and mapping the y-values to another column. When we use this technique, it’s useful to also add the name argument so that the legend is clear.

This code builds a grouped bar chart to compare the average table and depth of the diamonds within each cut:

plot_ly(
  data = df,
  x = ~cut,
  y = ~depth,
  type = "bar",
  name = "Depth"
) %>% 
  add_trace(y = ~table, name = "Table")


That type of chart makes it easiest to compare the different values within the same cut, but not as easy to compare the same value across different cuts. We can rearrange the chart to optimize a different set of comparisons by grouping the same numeric values together so it’s easier to compare different cuts.

This is where the “long” data format becomes easier to work with. We’ll filter out the “price” values, because they are on a much larger scale than depth and table:

plot_ly(
  data = df_long %>% filter(name != "price"),
  x = ~name,
  y = ~value,
  color = ~cut,
  type = "bar"
)
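
Both charts in this section rely on grouping being the default bar mode in plotly. As a small sketch, the mode can also be stated explicitly with layout(barmode = "group"), which makes the intent clear and mirrors the syntax used for other bar modes:

```r
library(plotly)
library(dplyr)

# Average depth and table per cut, rebuilt so this example runs on its own
df <- ggplot2::diamonds %>%
  group_by(cut) %>%
  summarize(table = mean(table),
            depth = mean(depth))

# Two traces placed side by side within each cut
p <- plot_ly(data = df, x = ~cut, y = ~depth, type = "bar", name = "Depth") %>%
  add_trace(y = ~table, name = "Table") %>%
  layout(barmode = "group")  # "group" is the default, written out here

p
```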


 

Stacked Barplot

Another variation of the grouped bar chart is to “stack” the groups together. This is most often done to compare cumulative values of multiple subgroups within each group, such as cut and clarity. In this case, we want to use the df_count data frame to see how many diamonds of each clarity grade fall into each grade of cut.

To stack the bars, we can introduce the layout options and set barmode = "stack":

plot_ly(
  data = df_count,
  x = ~clarity,
  y = ~n,
  color = ~cut,
  type = "bar"
) %>% 
  layout(barmode = "stack")


This stacks the bars, but the height of each bar is still proportional to the number of diamonds in each clarity grade, which makes it hard to compare the mix of cuts across grades. To compare proportions instead, we can build a “100% stacked” bar chart by plotting each cut’s share within its clarity grade, as shown here:

df_count %>%
  group_by(clarity) %>%
  mutate(prop = n/sum(n)) %>%
  plot_ly(x = ~clarity,
          y = ~prop,
          color = ~cut,
          type = "bar") %>%
  layout(barmode = "stack")
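
As a hedged extension of the proportions chart, the sketch below checks that the shares within each clarity grade add up to 1 and formats the y-axis as percentages. The names df_prop and totals and the axis title are illustrative choices; tickformat takes a d3-style format string:

```r
library(plotly)
library(dplyr)

# Share of each cut within every clarity grade
df_prop <- count(ggplot2::diamonds, cut, clarity) %>%
  group_by(clarity) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# Sanity check: the shares within each clarity grade should sum to 1
totals <- df_prop %>%
  group_by(clarity) %>%
  summarize(total = sum(prop))
totals

# 100% stacked barplot with a percentage axis
p <- plot_ly(data = df_prop, x = ~clarity, y = ~prop, color = ~cut, type = "bar") %>%
  layout(
    barmode = "stack",
    yaxis = list(title = "Share of diamonds", tickformat = ".0%")
  )

p
```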


 

Other Modifications

You can rotate a bar chart so the bars run horizontally instead of vertically by swapping the fields mapped to x and y. In plotly, the x parameter always describes the information mapped to the horizontal axis, while the y parameter maps to the vertical axis.

plot_ly(
  data = df,
  y = ~cut,
  x = ~price,
  type = "bar"
)
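
The example above leaves it to plotly to work out the bar direction from the data. As a minimal sketch, the direction can also be stated explicitly with the orientation argument of the bar trace, which documents the intent in the code:

```r
library(plotly)
library(dplyr)

# Average price per cut, rebuilt so this example runs on its own
df <- ggplot2::diamonds %>%
  group_by(cut) %>%
  summarize(price = mean(price))

# Horizontal bars: numeric values on x, categories on y
p <- plot_ly(
  data = df,
  x = ~price,
  y = ~cut,
  type = "bar",
  orientation = "h"  # "h" for horizontal bars, "v" for vertical bars
)

p
```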


Finally, you can overlay the actual value of each bar by mapping the value to the text argument and then setting the textposition to either "inside", "outside", "auto", or "none".

In this example, I’ve rounded the price and added a US Dollar sign in the text field.

plot_ly(
  data = df,
  x = ~cut,
  y = ~price,
  type = "bar",
  text = ~paste0("$", round(price, 2)),
  textposition = "outside"
)
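
For larger values, the labels can be easier to read with whole dollars and a thousands separator. The sketch below builds the label text in a separate column first; the column name label and the formatC() settings are illustrative choices:

```r
library(plotly)
library(dplyr)

# Average price per cut, with a pre-formatted label such as "$4,359"
df <- ggplot2::diamonds %>%
  group_by(cut) %>%
  summarize(price = mean(price)) %>%
  mutate(label = paste0("$", formatC(price, format = "f", digits = 0, big.mark = ",")))

# Barplot with the formatted labels drawn above each bar
p <- plot_ly(
  data = df,
  x = ~cut,
  y = ~price,
  type = "bar",
  text = ~label,
  textposition = "outside"
)

p
```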


 

Video, Further Resources & Summary

In this tutorial, you have learned how to draw an interactive barplot using the plotly package in R. In case you have further questions on this topic or other types of graphics, please let us know in the comments below.

 
