# How to Draw Histograms with plotly in Python (Example)

This article provides several examples of histograms in plotly using the Python programming language.

Note: This article was created in collaboration with Kirby White. Kirby is a Statistics Globe author, innovation consultant, and data science instructor. His Ph.D. is in Industrial-Organizational Psychology.

## Modules and Example Data

from vega_datasets import data
import plotly.express as px

We’ll use the stocks dataset for this example, which is included with the vega_datasets package. It contains monthly stock prices for several companies over several years. We’ll store this in a data frame called df.

df = data.stocks()
df

#   symbol  date        price
#   MSFT    2000-01-01  39.81
#   MSFT    2000-02-01  36.35
#   MSFT    2000-03-01  43.22
#   etc...

## Basic Histogram

Let’s create a simple histogram to look at the distribution of all the stock prices in this dataset:

fig1 = px.histogram(data_frame = df,
                    x = "price")

fig1.show()

This histogram makes it easy to see that the most common stock prices were between \$0 and \$50, as indicated by the tallest bar on the left. We can also see a cluster of observations with prices between \$300 and \$600, and that no price exceeds \$750.
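
To make explicit what each bar represents, here is a minimal sketch of the counting step behind a histogram, written in plain Python. The prices and the bin width of 50 are made up for illustration; plotly chooses its own bin width by default, so the real chart may group values differently.

```python
from collections import Counter

# Made-up prices for illustration only; not taken from the stocks dataset.
prices = [12.5, 36.35, 39.81, 43.22, 88.0, 120.4, 310.0, 455.9, 580.1]

# Assumed bin width; plotly picks its own width unless told otherwise.
bin_width = 50

# Map each price to the lower edge of its bin, e.g. 43.22 -> 0 and 88.0 -> 50.
bin_starts = [int(p // bin_width) * bin_width for p in prices]

# The height of each bar is simply the number of prices that fall in the bin.
counts = Counter(bin_starts)

for start in sorted(counts):
    print(f"{start}-{start + bin_width}: {counts[start]} value(s)")
```

The tallest bar in a histogram is therefore the bin that collected the most observations, which is how the chart above is read.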

## Plotting Multiple Groups

To see how the stock prices vary by company, we can create multiple histograms and overlay them on top of each other. We set color = 'symbol' to plot each company as a different color and barmode = 'overlay':

fig2 = px.histogram(data_frame = df,
                    x = 'price',
                    color = 'symbol',
                    barmode = 'overlay')

fig2.show()

This is not always ideal, as it can still be difficult to distinguish between groups. An alternative is to create a separate histogram for each company, an arrangement sometimes called facets or a ‘small multiples’ plot:

fig3 = px.histogram(data_frame = df,
                    x = 'price',
                    color = 'symbol',
                    facet_col = 'symbol')

fig3.show()

This makes it much clearer that MSFT prices are typically lower than the others, and that GOOG tends to have the highest prices.

## Bin Sizes and Alternative Graphs

Until now, we have used the default settings for the width of the bars in our histograms. However, you may change the number or size of bins according to your own needs.

In this example, I’ll show how to limit the number of bins with the nbins argument. The algorithm then chooses a suitable bin size that does not exceed this limit. We’ll recreate our first graph but limit the number of bins to 10:

fig4 = px.histogram(data_frame = df,
                    x = "price",
                    nbins = 10)

fig4.show()

A common alternative for visualizing the distribution of numeric values is the violin plot. A violin plot shows a smoothed density estimate of the values, mirrored around a central axis, so it looks roughly like a smoothed histogram turned on its side. This can be a simple way to display multiple distributions alongside each other. Consider the example graphic below:

fig5 = px.violin(data_frame = df,
                 x = 'symbol',
                 color = 'symbol',
                 y = 'price')

fig5.show()

This is another way to clearly show that most MSFT prices are in a relatively small range while the GOOG prices are spread across a wide range.

Finally, another way to visualize a distribution is the boxplot. Boxplots summarize each group by its median and quartiles, which makes differences between groups easy to compare. They would look like this:

fig6 = px.box(data_frame = df,
              x = 'symbol',
              color = 'symbol',
              y = 'price')

fig6.show()
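
To see where the box edges come from, here is a minimal sketch that computes the three quartiles with Python's standard library. The prices are made up for illustration, and plotly may use a different interpolation rule for quartiles, so the values in a real chart can differ slightly.

```python
import statistics

# Made-up prices for illustration only; not taken from the stocks dataset.
prices = [10, 20, 30, 40, 50, 60, 70]

# n=4 splits the data into four equal parts and returns the three cut points.
q1, median, q3 = statistics.quantiles(prices, n=4)

# The box spans q1 to q3, with a line at the median.
iqr = q3 - q1

print(q1, median, q3, iqr)  # prints 20.0 40.0 60.0 40.0
```

The box in a boxplot covers the middle half of the values, so a taller box indicates a wider spread of prices.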
