# Overlay Histogram with Fitted Density Curve in Base R & ggplot2 Package (2 Examples)

In this tutorial you’ll learn how to **fit a density plot to a histogram** in the R programming language.

Table of contents:

- Introduction of Example Data
- Example 1: Histogram & Density with Base R
- Example 2: Histogram & Density with ggplot2 Package
- Video, Further Resources & Summary

Let’s just jump right in!

## Introduction of Example Data

In the examples of this R programming tutorial, we’ll use the following example data:

    set.seed(18462)                                    # Create example data
    data <- data.frame(x = round(rnorm(1000, 10, 10)))
    head(data)                                         # Print example data
    #     x
    # 1   6
    # 2   7
    # 3  14
    # 4   4
    # 5 -10
    # 6  16

As you can see based on the output of the RStudio console, our example data contains only one numeric column. Now, let’s draw these data…

## Example 1: Histogram & Density with Base R

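Example 1 explains how to fit a density curve to a histogram with the basic installation of the R programming language. First, we need to use the hist function to draw a histogram. Note that we set prob = TRUE so that the y-axis shows densities instead of counts, which puts the bars on the same scale as a kernel density curve: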

    hist(data$x, prob = TRUE)                          # Create histogram with Base R

**Figure 1: Histogram Created with Base R.**

Figure 1 shows the output of the previous R code: A histogram without a density line. If we want to add a kernel density to this graph, we can use a combination of the lines and density functions:

    lines(density(data$x), col = "red")                # Overlay density curve

**Figure 2: Histogram & Overlaid Density Plot Created with Base R.**

Figure 2 illustrates the final result of Example 1: A histogram with a fitted density curve created in Base R.
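As a side note on the prob = TRUE argument used in Example 1, the following sketch (not part of the original tutorial code) checks numerically that the histogram bars and the kernel density are on the same scale. The object names h, d, bar_area, and curve_area are chosen only for this illustration.

```r
set.seed(18462)                                    # Recreate example data
data <- data.frame(x = round(rnorm(1000, 10, 10)))

h <- hist(data$x, plot = FALSE)                    # Compute bins without drawing
bar_area <- sum(h$density * diff(h$breaks))        # Total area of density-scaled bars
bar_area                                           # Equals 1 up to rounding

d <- density(data$x)                               # Kernel density estimate
curve_area <- sum(d$y) * diff(d$x[1:2])            # Simple Riemann sum over the grid
curve_area                                         # Approximately 1
```

Because both areas are (close to) 1, the curve sits sensibly on top of the bars. With the default count scale of hist, the curve would be squeezed against the x-axis instead.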

## Example 2: Histogram & Density with ggplot2 Package

Example 2 shows how to create a histogram with a fitted density plot based on the ggplot2 add-on package. First, we need to install and load ggplot2 to R:

    install.packages("ggplot2")                        # Install & load ggplot2
    library("ggplot2")

Now, we can use a combination of the ggplot, geom_histogram, and geom_density functions to create our graphic:

    ggplot(data, aes(x)) +                             # ggplot2 histogram & density
      geom_histogram(aes(y = after_stat(density))) +   # after_stat() replaces the deprecated stat()
      geom_density(col = "red")

**Figure 3: Histogram & Overlaid Density Plot Created with ggplot2 Package.**

Figure 3 visualizes our histogram and density line created with the ggplot2 package. Note that the histogram bars of Example 1 and Example 2 look slightly different, since the two use different default binning: geom_histogram draws 30 bins unless told otherwise, whereas the hist function of Base R derives its breaks from Sturges' formula.
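If you want the ggplot2 bars to line up with the Base R bars, one option is to pass the Base R break points to geom_histogram. The following is a minimal sketch, assuming ggplot2 3.3.0 or newer (for after_stat); the names base_breaks and p are used only for this illustration.

```r
library(ggplot2)

set.seed(18462)                                    # Recreate example data
data <- data.frame(x = round(rnorm(1000, 10, 10)))

base_breaks <- hist(data$x, plot = FALSE)$breaks   # Break points chosen by Base R

p <- ggplot(data, aes(x)) +
  geom_histogram(aes(y = after_stat(density)),
                 breaks = base_breaks) +           # Reuse the Base R breaks in ggplot2
  geom_density(col = "red")
p
```

Alternatively, the bins or binwidth arguments of geom_histogram can be set by hand when exact agreement with Base R is not needed.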

## Video, Further Resources & Summary

Some time ago, I published a video on my YouTube channel that illustrates the content of this tutorial.


Furthermore, you might want to have a look at some of the related articles which I have published on my homepage.

- Create Histogram in Base R
- Create Kernel Density Plot in R
- Create ggplot2 Histogram in R
- Draw Multiple Overlaid Histograms with ggplot2 Package
- Draw Multiple Variables as Lines to Same ggplot2 Plot
- R Graphics Gallery
- The R Programming Language

In this tutorial, I illustrated how to **combine histograms with probability on the y-axis and density plots** in the R programming language. If you have additional questions or comments, let me know in the comments section below.


## 9 Comments

Hello Joachim

Thanks for your nice videos. I have the following R script which is for only one .tsv file. I want to tweak it in a way that can plot (Histogram + line) two similar but separate .tsv files with different colours overlaid on each other. Could you please guide?

    # read in data
    df = read.csv("your_distribution.tsv", sep="\t")

    # filter Ks distribution (0.001 < Ks < 5)
    lower_bound = 0.001
    upper_bound = 5
    df = df[df$Ks < upper_bound & df$Ks > lower_bound,]

    # perform node-averaging (redo when applying other filters)
    dff = aggregate(df$Ks, list(df$Family, df$Node), mean)

    # reflect the data around the lower Ks bound to account for boundary effects
    ks = c(dff$x, -dff$x + lower_bound)

    # plot a histogram and KDE on top
    hist(ks, prob=TRUE, xlim=c(0, upper_bound), n=50)
    lines(density(ks), xlim=c(0, upper_bound))

Hello Ardy,

Thank you for following us! Would you like to draw two density lines overlaid on each other on a histogram?

Regards,

Cansu

Hello Cansu

Yes that is right.

Regards

Ardy

Hello Ardy,

Here is how to do it in two ways (via graphics and ggplot2 libraries):

Expanding the data frame for this example:
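The code that accompanied this step is not preserved in this copy of the reply. A minimal sketch of what such a step could look like, assuming a hypothetical second sample (data2) and a hypothetical grouping column (group):

```r
set.seed(18462)
data <- data.frame(x = round(rnorm(1000, 10, 10)))    # Tutorial example data

# Hypothetical second sample with a shifted mean, used only for illustration
data2 <- data.frame(x = round(rnorm(1000, 25, 10)))

# Stack both samples and label them with a grouping column
data_long <- rbind(cbind(data,  group = "A"),
                   cbind(data2, group = "B"))
table(data_long$group)                                # Number of rows per group
```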

Using R graphics:
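The original snippet for this step is likewise missing here. The sketch below shows one way to draw two density lines on top of a histogram with Base R graphics; the second sample b and all object names are assumptions made for this illustration.

```r
set.seed(18462)
a <- round(rnorm(1000, 10, 10))                    # First sample (as in the tutorial)
b <- round(rnorm(1000, 25, 10))                    # Hypothetical second sample

d_a <- density(a)                                  # Kernel density per sample
d_b <- density(b)
h   <- hist(c(a, b), plot = FALSE)                 # Pooled histogram, not drawn yet

# Leave enough vertical room for the bars and both curves
y_max <- max(h$density, d_a$y, d_b$y)

hist(c(a, b), prob = TRUE, ylim = c(0, y_max),
     main = "Histogram with two density curves", xlab = "x")
lines(d_a, col = "red")                            # Density of first sample
lines(d_b, col = "blue")                           # Density of second sample
legend("topright", legend = c("A", "B"),
       col = c("red", "blue"), lty = 1)
```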

Using ggplot2 library:
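The ggplot2 code from the reply is also not preserved. A minimal sketch under the same assumptions (hypothetical second sample, hypothetical group column, ggplot2 3.3.0 or newer for after_stat) might look as follows:

```r
library(ggplot2)

set.seed(18462)
data_long <- data.frame(
  x     = round(c(rnorm(1000, 10, 10), rnorm(1000, 25, 10))),
  group = rep(c("A", "B"), each = 1000)            # Hypothetical group labels
)

p <- ggplot(data_long, aes(x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", col = "white") + # Pooled histogram
  geom_density(aes(col = group))                   # One density line per group
p
```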

I hope this answers your question. Let me know if you have any further comments.

Regards,

Cansu

Dear Cansu

Thanks a lot for your help. Sorry, I am so new to R. Could you pls let me know how/where to fit these codes in the context of the following, if possible?

    # read in data
    df = read.csv("your_distribution1.tsv", sep="\t")
    df = read.csv("your_distribution2.tsv", sep="\t")

    # filter Ks distribution (0.001 < Ks < 5)
    lower_bound = 0.001
    upper_bound = 5
    df = df[df$Ks < upper_bound & df$Ks > lower_bound,]

    # perform node-averaging (redo when applying other filters)
    dff = aggregate(df$Ks, list(df$Family, df$Node), mean)

    # reflect the data around the lower Ks bound to account for boundary effects
    ks = c(dff$x, -dff$x + lower_bound)

    # plot a histogram and KDE on top
    hist(ks, prob=TRUE, xlim=c(0, upper_bound), n=50)
    lines(density(ks), xlim=c(0, upper_bound))

Sorry I am not sure about term overlay! But better to say overlap two graphs for comparison purpose.

Hello again:)

Could we say that you would like to plot multiple histograms on the same panel? Not fitting multiple curves?

Regards,

Cansu

Yes, that is correct.

You can check our tutorial Draw Multiple Overlaid Histograms with ggplot2 Package in R and this link.

I hope those help!

Regards,

Cansu
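For readers with the same question, here is a minimal Base R sketch of how the script from the comment above could be adapted to two files. It is only an illustration: the simulated data frames df1 and df2 stand in for the two .tsv files, and the helpers make_fake and prep_ks are hypothetical names introduced here. Note that the second read.csv call in the question overwrites df, so each file needs its own object.

```r
# Hypothetical stand-ins for the two files; in practice use
# read.csv("your_distribution1.tsv", sep = "\t") and the second file instead
set.seed(1)
make_fake <- function(n, rate) {
  data.frame(Family = sample(1:50, n, replace = TRUE),
             Node   = sample(1:5,  n, replace = TRUE),
             Ks     = rexp(n, rate))
}
df1 <- make_fake(2000, 1.0)
df2 <- make_fake(2000, 0.6)

lower_bound <- 0.001
upper_bound <- 5

# Filter, node-average, and reflect one data frame (steps from the question)
prep_ks <- function(df) {
  df  <- df[df$Ks < upper_bound & df$Ks > lower_bound, ]
  dff <- aggregate(df$Ks, list(df$Family, df$Node), mean)
  c(dff$x, -dff$x + lower_bound)
}
ks1 <- prep_ks(df1)
ks2 <- prep_ks(df2)

# Shared breaks so that the bars of both histograms line up
breaks <- seq(-upper_bound, upper_bound, length.out = 101)

hist(ks1, prob = TRUE, breaks = breaks, xlim = c(0, upper_bound),
     col = rgb(1, 0, 0, 0.4), main = "Ks distributions", xlab = "Ks")
hist(ks2, prob = TRUE, breaks = breaks, add = TRUE,
     col = rgb(0, 0, 1, 0.4))                      # Second histogram on top
lines(density(ks1), col = "red")                   # KDE for first file
lines(density(ks2), col = "blue")                  # KDE for second file
```

The semi-transparent colors created with rgb keep both histograms visible where they overlap.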