Plot All Columns of Data Frame in R (3 Examples) | How to Draw Each Variable

 

In this tutorial, I’ll explain how to draw all variables of a data set in a line plot in the R programming language.

Let’s dive right in!

 

Creation of Example Data

Let’s first create some example data:

set.seed(987425)                                              # Create example data
data <- data.frame(x = 1:10,
                   y1 = rnorm(10),
                   y2 = runif(10),
                   y3 = rpois(10, 1))
data                                                          # Print example data
#     x          y1        y2 y3
# 1   1 -1.19464442 0.6631678  2
# 2   2 -0.27292005 0.9540095  0
# 3   3 -0.05134384 0.6712889  1
# 4   4  0.45500651 0.1736061  1
# 5   5 -2.07007318 0.2290419  0
# 6   6  0.92083477 0.3240386  0
# 7   7 -0.26656251 0.2139329  0
# 8   8  0.10529478 0.7744575  1
# 9   9 -2.17999010 0.6029383  1
# 10 10 -1.51876252 0.8177035  0

As the RStudio console output above shows, our example data contains ten rows and four columns. The variable x ranges from 1 to 10 and defines the x-axis values for each of the other variables.

 

Example 1: Drawing Multiple Variables Using Base R

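The following code shows how to draw multiple columns of a data frame in a single line chart using the plot and lines functions of Base R. The plot function draws the first line and sets up the axes (note the ylim argument, which makes the y-axis wide enough for all three variables), and each call to lines adds one more column. Have a look at the following R syntax: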

plot(data$x, data$y1, type = "l", col = 1, ylim = c(- 3, 3))  # Plot with Base R
lines(data$x, data$y2, type = "l", col = 2)
lines(data$x, data$y3, type = "l", col = 3)

 

[Figure 1: Base R line plot of all data frame columns]

 

As shown in Figure 1, we created a Base R line plot showing three lines with the previous code.
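
With many columns, writing one lines call per variable becomes tedious. The following is a minimal sketch of an alternative, assuming that every column other than x should be drawn: the Base R function matplot plots all columns of a matrix in a single call. The object names y_cols and y_range are only illustrative and are not part of the original example.

```r
set.seed(987425)                                  # Same example data as above
data <- data.frame(x = 1:10,
                   y1 = rnorm(10),
                   y2 = runif(10),
                   y3 = rpois(10, 1))

y_cols <- setdiff(names(data), "x")               # All columns except the x-axis variable
y_range <- range(unlist(data[y_cols]))            # Common y-axis limits covering every column

matplot(data$x, as.matrix(data[y_cols]),          # Draw one line per column
        type = "l", lty = 1,
        col = seq_along(y_cols), ylim = y_range,
        xlab = "x", ylab = "value")
legend("topright", legend = y_cols,               # Add a legend with the column names
       col = seq_along(y_cols), lty = 1)
```

Because the y-axis limits are computed from the data, this version does not need a manually chosen ylim.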

 

Example 2: Drawing Multiple Variables Using ggplot2 Package

Example 2 illustrates how to use the ggplot2 package to create a graphic containing the values of all data frame columns. First, we need to reshape our data frame to long format:

data_ggp <- data.frame(x = data$x,                            # Reshape data frame
                       y = c(data$y1, data$y2, data$y3),
                       group = c(rep("y1", nrow(data)),
                                 rep("y2", nrow(data)),
                                 rep("y3", nrow(data))))
head(data_ggp)                                                # Head of reshaped data frame
#   x           y group
# 1 1 -1.19464442    y1
# 2 2 -0.27292005    y1
# 3 3 -0.05134384    y1
# 4 4  0.45500651    y1
# 5 5 -2.07007318    y1
# 6 6  0.92083477    y1
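
The manual reshaping above names each column explicitly. As a hedged sketch of a more general approach, assuming all columns except x should be stacked, the Base R function stack can build the same long format without listing the variables one by one. The names y_cols, stacked, and data_long are illustrative only.

```r
set.seed(987425)                                  # Same example data as above
data <- data.frame(x = 1:10,
                   y1 = rnorm(10),
                   y2 = runif(10),
                   y3 = rpois(10, 1))

y_cols <- setdiff(names(data), "x")               # All columns except the x-axis variable
stacked <- stack(data[y_cols])                    # Columns: values (numeric), ind (column name)

data_long <- data.frame(x = rep(data$x, times = length(y_cols)),
                        y = stacked$values,
                        group = stacked$ind)
head(data_long)                                   # Head of reshaped data frame
```

The result has one row per combination of x value and original column, so it can be passed to ggplot2 in the same way as data_ggp.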

Furthermore, we have to install and load the ggplot2 package, if we want to use the corresponding functions:

install.packages("ggplot2")                                   # Install & load ggplot2
library("ggplot2")

Now, we can draw a ggplot2 line graph with the following R code:

ggp <- ggplot(data_ggp, aes(x, y, col = group)) +             # Create ggplot2 plot
  geom_line()
ggp                                                           # Draw plot

 

[Figure 2: ggplot2 line plot of all data frame columns]

 

As shown in Figure 2, we plotted a graph showing a different line for each variable with the previous R programming code. On the right side of the plot, we have also created a legend illustrating the different groups of our data.

 

Example 3: Drawing Multiple Variables in Different Panels with ggplot2 Package

In Example 3, I’ll show how to draw each of our columns in a different panel of a facet plot. For this, we simply need to add the facet_grid function to our previously created graph of Example 2:

ggp + facet_grid(group ~ .)                                   # Draw plot in different panels

 

[Figure 3: ggplot2 facet plot of all data frame columns]

 

As shown in Figure 3, the previous syntax created a facet plot using the ggplot2 package.
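
By default, facet_grid uses the same y-axis range in every panel, which can flatten variables that live on a smaller scale. The following sketch, assuming the ggplot2 package is installed, uses the scales argument of facet_grid to give each panel its own y-axis. The object name ggp_free is illustrative only.

```r
library("ggplot2")

set.seed(987425)                                  # Same example data as above
data <- data.frame(x = 1:10,
                   y1 = rnorm(10),
                   y2 = runif(10),
                   y3 = rpois(10, 1))

data_ggp <- data.frame(x = data$x,                # Long format as in Example 2
                       y = c(data$y1, data$y2, data$y3),
                       group = rep(c("y1", "y2", "y3"), each = nrow(data)))

ggp_free <- ggplot(data_ggp, aes(x, y, col = group)) +
  geom_line() +
  facet_grid(group ~ ., scales = "free_y")        # Separate y-axis range per panel
ggp_free                                          # Draw plot
```

Whether free scales are preferable depends on the goal: shared axes make magnitudes comparable across panels, while free axes make the shape of each individual series easier to see.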

 


To summarize: In this R programming tutorial, you learned how to draw each column of a data frame in a graphic. Don’t hesitate to let me know in the comments section below if you have any additional questions.

 



14 Comments

  • Simon G. Gichuki
    April 7, 2021 2:18 pm

    Hello. Hope you are doing well. Thanks for the illustration. I have tried to follow the steps to plot a similar graph, but without success. Below is the kind of data I want to plot: Chainage vs. the other columns.
    Kindly take me through the steps. This is my first time using R.
    Regards, and Thanks

    Chainage G.L EL HGL
    0 1940 1945 1944.971367
    134 1940.61 1944.368988 1944.340355
    152 1941.38 1944.284225 1944.255592
    165 1940.47 1944.223008 1944.194374
    229 1940.12 1943.921629 1943.892995
    308 1921.55 1943.680259 1943.663921
    370 1923.74 1943.490829 1943.474491
    376 1925.72 1943.472497 1943.456159
    389 1925.57 1943.434553 1943.419002
    406 1925.67 1943.387207 1943.372424
    447 1925.47 1943.278384 1943.264349
    569 1917.79 1942.970171 1942.956865
    614 1919.37 1942.862111 1942.849515
    631 1921.87 1942.821288 1942.808692
    648 1924.02 1942.782541 1942.770635
    661 1925.3 1942.754459 1942.743224
    1019 1927.97 1942.178456 1942.170282
    1191 1923.93 1941.460382 1941.442796
    1376 1927.07 1940.738343 1940.72199
    1376 1927.07 1940.738343 1940.723179
    1422 1927.61 1940.582658 1940.568638
    1510 1927.31 1940.284826 1940.270806
    1771 1923.24 1939.465804 1939.452883
    1913 1925.74 1939.086525 1939.075668
    2010 1925.38 1938.82744 1938.816583
    2090 1923.64 1938.613761 1938.602904
    2228 1923.05 1938.275593 1938.265701
    2241 1925.52 1938.243737 1938.233844
    2510 1923.53 1936.711852 1936.692537
    2717 1928.88 1935.533041 1935.513726
    2746 1929.88 1935.367893 1935.348579

    • Hi Simon,

      Thank you, I’m fine and you?

      Your data seems to be formatted well for the code of Example 1. You would simply have to replace the name of the data and the names of the columns.

      At which point do you get problems?

      Regards

      Joachim

  • Syed Hasnain Raza
    October 28, 2021 4:47 pm

    Try this one, using tidyr (Tidy Messy Data).

    library(dplyr) # provides %>% and select()
    library(ggplot2)

    df %>% select(2:3, 1) %>% # keep columns 2 and 3 plus Chainage (column 1)
    tidyr::gather("id", "value", -Chainage) %>% # reshape to long format, keeping Chainage as the x variable
    ggplot(aes(Chainage, value, color = id)) + # plot Chainage vs. the other columns
    geom_point(size = 2, alpha = 0.3) + # data point size and transparency
    geom_line(size = 1) + # line size
    scale_y_continuous(trans = "log") + # log-scaled y-axis for better presentation
    theme_minimal() # theme style

  • Hello Joachim,

    I am trying to plot a bar chart for the data frame below. This is the first time I have used R.

    There is actually more data than shown. The collection period ran from Jul/01/21 to Nov/30/21. Is there a way to plot a bar chart for this data frame using ggplot, with the DateTime column on the x-axis?

    Many thanks

    Chuck

    DateTime A0Z9J8CY_3N_39Ft A2VLJXJQ_12N_18Ft AJGQGG3Q_20N_18Ft A8LGKQ1P_13S_19Ft A5QRV0S1_19S_14Ft AQRMJZYP_6S_26Ft La Casa_AirNow Globeville_AirNow AJ9MZTHB_25_Globeville
    7/1/2021 1:00 2.52 1.87 3.72 4.03 3.25 2.95 4.8 5.1 2.18
    7/1/2021 2:00 2.12 2.16 3.56 4.56 3.54 2.78 4.2 4.3 2.46
    7/1/2021 3:00 2.72 2.19 3.55 3.98 3.74 3.22 3.9 4.6 2.86
    7/1/2021 4:00 6.61 5.24 8.14 5.18 4.87 5.28 4.2 9.4 6.63
    7/1/2021 5:00 8.38 6.95 12.13 10.45 12.78 9.98 4.5 7.7 6.84
    7/1/2021 6:00 6.16 5.72 7.15 8.76 9.79 6.29 4.8 7.3 7.38

    • Hey Chuck,

      Welcome to the R community! 🙂

      You may create a barplot of your datetime column and the A0Z9J8CY_3N_39Ft column using the following R code:

      data <- data.frame(datetime = c("7/1/2021 1:00",
                                      "7/1/2021 2:00",
                                      "7/1/2021 3:00"),
                         A0Z9J8CY_3N_39Ft = c(2.52,
                                              2.12,
                                              2.72))
       
      library("ggplot2")
       
      ggplot(data, aes(x = datetime,
                       y = A0Z9J8CY_3N_39Ft)) +
        geom_col()

       

      [Figure: bar chart]

       

      Regards,
      Joachim

  • Joachim,

    Thank you so much for your quick response. When plotting at each location, it worked. However, when plotting at multi-locations, my code did not work. Please see the code below. I also tried to insert the graph but couldn’t

    The x axis did not show any datetime. The y axis did not show the correct scale.

    AQ <- AQ %>%
    gather(key = location, value = aqdata, 2:10)
    AQ

    # creating graph
    bar <- ggplot(AQ,aes(x=DateTime), na.rm=TRUE)+geom_bar(aes(fill=location))+
    labs(x = "Date & Time", y = "Air Quality", title = "Air Quality at Multi-Locations")
    bar

    • Hey Chuck,

      Please see the updated example below:

      data <- data.frame(datetime = c("7/1/2021 1:00",
                                      "7/1/2021 2:00",
                                      "7/1/2021 3:00"),
                         location = rep(c("LocA", "LocB"), each = 3),
                         A0Z9J8CY_3N_39Ft = c(2.52,
                                              2.12,
                                              2.72,
                                              2.55,
                                              2.21,
                                              2.59))
       
      library("ggplot2")
       
      ggplot(data, aes(x = datetime,
                       y = A0Z9J8CY_3N_39Ft,
                       fill = location)) +
        geom_bar(stat = "identity",
                 position = "dodge")

       

      [Figure: grouped bar chart]

       

      Regards,
      Joachim

  • Thank you very much, Joachim. I’ll give it a try and will let you know how it turns out.

  • Md. Alamgir Hossain
    June 18, 2023 7:41 am

    Hi,
    Thanks for your amazing support. I am having problems plotting this data, as I am new to this field. Can you please help me figure it out?

    Fraction 1 Fraction 2 Fraction 3 Fraction 4 Concentration
    0.6276 0.6565 0.6744 0.5078 20
    0.5168 0.7521 0.6432 0.3279 10
    0.3746 0.6263 0.4668 0.2258 5
    0.2574 0.5070 0.3186 0.1665 2.5
    0.2019 0.3683 0.2251 0.1529 1.25
    0.1512 0.2496 0.1759 0.1485 0.625
    0.1504 0.2149 0.1663 0.1591 0.3125
    0.1549 0.1462 0.1576 0.1722 0

    My data looks like this. I want to plot the fractions against concentration. What could be my approach? Note that all columns are numeric. It would be a great help if you could assist me with this issue.

  • Md. Alamgir Hossain
    June 22, 2023 1:20 pm

    Hi, my data looks like this. I need to take the average of A, B, and C for “Fraction 1” and of D, E, and F for “Fraction 2”, and then plot these two fractions against the concentration in the same line graph. My questions are:
    1. How can I calculate the average and plot it?
    2. Do I need to plot the SD as well? If yes, how?

    conc. (%) A B C D E F
    20.0000 0.7150 0.6014 0.5665 0.5954 0.6863 0.6877
    10.0000 0.5245 0.5139 0.5119 0.7416 0.7640 0.7506
    5.0000 0.4305 0.3585 0.3347 0.6231 0.5843 0.6716
    2.5000 0.2941 0.2592 0.2189 0.4809 0.5117 0.5285
    1.2500 0.2257 0.1990 0.1809 0.3375 0.3670 0.4005
    0.6250 0.1638 0.1439 0.1460 0.2189 0.2472 0.2827
    0.3125 0.1673 0.1372 0.1466 0.2011 0.2051 0.2385
    0.0000 0.1565 0.1449 0.1634 0.1556 0.1459 0.1371

    • Hello Alamgir,

      Showing the spread can be reasonable, depending on what you want to convey. The standard deviation is appropriate if you want to give a sense of how the individual data points scatter around the mean. Standard errors (or confidence intervals) are preferable if you want to show how precisely your sample mean estimates the population mean. The code below draws the standard deviations as error bars around each average point; to show standard errors instead, divide each standard deviation by sqrt(3), since each fraction has three replicates.

      # Required Libraries
      library(ggplot2)
      library(dplyr)
       
      # Data
      data <- data.frame(
        'conc' = c(20, 10, 5, 2.5, 1.25, 0.625, 0.3125, 0),
        'A' = c(0.7150, 0.5245, 0.4305, 0.2941, 0.2257, 0.1638, 0.1673, 0.1565),
        'B' = c(0.6014, 0.5139, 0.3585, 0.2592, 0.1990, 0.1439, 0.1372, 0.1449),
        'C' = c(0.5665, 0.5119, 0.3347, 0.2189, 0.1809, 0.1460, 0.1466, 0.1634),
        'D' = c(0.5954, 0.7416, 0.6231, 0.4809, 0.3375, 0.2189, 0.2011, 0.1556),
        'E' = c(0.6863, 0.7640, 0.5843, 0.5117, 0.3670, 0.2472, 0.2051, 0.1459),
        'F' = c(0.6877, 0.7506, 0.6716, 0.5285, 0.4005, 0.2827, 0.2385, 0.1371)
      )
       
      # Calculating averages and SD
      data <- data %>%
        mutate(Fraction1 = rowMeans(select(., A:C)),
               Fraction2 = rowMeans(select(., D:F)),
               Fraction1Sd = apply(select(., A:C), 1, sd),
               Fraction2Sd = apply(select(., D:F), 1, sd))
       
      # Plot
      ggplot() +
        geom_line(data = data, aes(x = conc, y = Fraction1), color = "blue") +
        geom_line(data = data, aes(x = conc, y = Fraction2), color = "red") +
        geom_errorbar(data = data, aes(x = conc, ymin = (Fraction1 - Fraction1Sd), ymax = Fraction1 + Fraction1Sd), color = "blue", width = 0.2) +
        geom_errorbar(data = data, aes(x = conc, ymin = (Fraction2 - Fraction2Sd), ymax = Fraction2 + Fraction2Sd), color = "red", width = 0.2) +
        labs(x = "Concentration (%)", y = "Average Value") +
        theme_minimal()

      Best,
      Cansu

