Convert Data Frame with Date Column to Time Series Object in R (Example)

 

In this article you’ll learn how to change the data frame class to the xts / zoo data type in the R programming language.

Table of contents:

Let’s take a look at some R codes in action!

 

Creation of Example Data

We use the following data as basement for this R tutorial:

data <- data.frame(date = c("2020-10-01",    # Create example data
                            "2021-07-08",
                            "2018-01-18",
                            "2018-05-05",
                            "2025-12-10"),
                   value = 1:5)
data                                         # Print example data
#         date value
# 1 2020-10-01     1
# 2 2021-07-08     2
# 3 2018-01-18     3
# 4 2018-05-05     4
# 5 2025-12-10     5

Have a look at the previous output of the RStudio console. It shows that our example data has two columns. The first variable contains dates formatted as character strings and the second variable contains some randomly selected values.

We can also check the class of our data:

class(data)                                  # Check class of data
# "data.frame"

As you can see, our example data has the data.frame class.

 

Example: Converting Data Frame to xts / zoo Object

This Example illustrates how to switch from data.frame class to a time series object (i.e. xts or zoo). First, we have to convert our character string variable to the Date class.

data$date <- as.Date(data$date)              # Convert character string column to date

We also need to install and load the xts package:

install.packages("xts")                      # Install & load xts package
library("xts")

We can use the xts function provided by the xts package to convert our data frame to a time series object as shown below:

data_ts <- xts(data$value, data$date)        # Convert data frame to time series
data_ts                                      # Print time series
#            [,1]
# 2018-01-18    3
# 2018-05-05    4
# 2020-10-01    1
# 2021-07-08    2
# 2025-12-10    5

Let’s check the data type of our updated data object:

class(data_ts)                               # Check class of time series
# "xts" "zoo"

As you can see, we switched the class from data.frame to xts / zoo.

 

Video, Further Resources & Summary

In case you need further explanations on the R programming code of the present tutorial, you could have a look at the following video of my YouTube channel. In the video, I show the content of this article in a live session.

 

 

In addition, you might want to have a look at the related posts of this website. I have released several articles already:

 

In summary: In this tutorial you learned how to convert data frames to times series objects in the R programming language. In case you have additional questions, please tell me about it in the comments section.

 

Subscribe to the Statistics Globe Newsletter

Get regular updates on the latest tutorials, offers & news at Statistics Globe.
I hate spam & you may opt out anytime: Privacy Policy.


44 Comments. Leave new

  • Abujam Manglem singh
    July 14, 2021 10:11 am

    I have yearly data for monthly temperature in celcius like:

    Year. Jan. Feb. Mar. Ap.
    2001. 5. 6
    2002 6 8
    2003. 7. 8
    2004. 5. 5.5
    2005 4.5. 6
    Etc. Etc. Etc

    How do i covert this data into time series so that ARIMA model can be use for forecasting values for a given month

    Reply
    • Hey Abujam,

      I’m sorry for the late response, I just came back from holidays and did not have the chance to read your message earlier.

      Are you still looking for a solution to this problem?

      Regards

      Joachim

      Reply
  • I have monthly data for sales of different stores like:
    store_code. year. month. sales. city …
    001. 2010. 1. 8899. ny
    001. 2010. 2. 8334. ny
    001. 2011 1. 8299. ny
    002. 2010. 1. 1399. la

    How can i convert this data into time series so that one ARIMA model can be built for forecasting values for all stores? Thanks in advance

    Reply
  • Hello Joachim
    I have data frame with 16 rows and 16383 columns and covert to create ts object using the date vector.
    I created d <- c("2000-03-01","2001-03-01","2002-03-01","2003-03-01","2004-03-01","2005-03-01","2006-03-01","2007-03-01","2008-03-01","2009-03-01","2010-03-01","2011-03-01","2012-03-01","2013-03-01","2014-03-01","2015-03-01")
    d <- as.Date(d)
    Not sure how I combine my with my dates.
    Really appreciate if you can help me

    Reply
  • Yes please. it tried to convert my to ts time series object. The function I am using requires univariate time series of class ts. I am using BFAST package for trend analysis.
    I created data frame from my date column and values. Then I used as.Date:
    NPP_t_date <- as.Date(NPP_t_date$date)
    # I dropped the first column to omit the name
    dataset_as_time_series <- ts(data = NPP_t_date[-1], start=2000, end=2015, frequency = 1)
    – first I tried to plot the time series and got this error (may be because I have 1415565 columns :
    Error in plotts(x = x, y = y, plot.type = plot.type, xy.labels = xy.labels, :
    cannot plot more than 10 series as "multiple"
    – I tried to run this function for gap fill NA —tsgf1 1 and only the first element will be used

    Reply
  • Hamad Mohamed
    December 2, 2021 1:03 am

    Hello Joachim
    Thank you. Would it be possible that you just help me in converting my column date and data column into time series object using ts function

    Reply
    • Hey Hamad,

      Does this work?

      d <- c("2000-03-01","2001-03-01","2002-03-01","2003-03-01","2004-03-01","2005-03-01","2006-03-01","2007-03-01","2008-03-01","2009-03-01","2010-03-01","2011-03-01","2012-03-01","2013-03-01","2014-03-01","2015-03-01")
      d <- as.Date(d)
       
      data_new <- data.frame(d, data_old)
       
      data_ts <- xts(data_new[ , - 1], data_new$d)

      Regards,
      Joachim

      Reply
  • Michael D McMillan
    December 20, 2021 4:24 am

    Hi Joachim.

    I have a date frame called “newallportfolios2” that contains 25 portfolios (as columns) and 1 column of dates each with the headings: “date”,”PR1″,”PR2″,”PR3″,”PR4″,”PR5″,”PR6″,”PR7″,”PR8″,”PR9″,”PR10″,”PR11″,”PR12″,”PR13″,”PR14″,”PR15″,”PR16″,”PR17″,”PR18″,”PR19″,”PR20″,”PR21″,”PR22″,”PR23″,”PR24″,”PR25″ (PR referring to portfolio return)

    I have also already converted the dates from a date frame to a Date using the following code:
    newallportfolios2$date <-as.Date(newallportfolios2$date).

    Im struggling with the last step of converting the dates and 25 portfolios into an xts. Any help would be appreciated.

    Reply
    • Hey Michael,

      Please have a look at the following example code:

      newallportfolios2 <- data.frame(date = c("2020-10-01",
                                               "2021-07-08",
                                               "2018-01-18",
                                               "2018-05-05",
                                               "2025-12-10"),
                                      PR1 = 1:5,
                                      PR2 = 6:10)
      newallportfolios2$date <- as.Date(newallportfolios2$date)
       
      library("xts")
       
      newallportfolios2_ts <- xts(newallportfolios2[ , colnames(newallportfolios2) != "date"], newallportfolios2$date)
       
      class(newallportfolios2_ts)
      # [1] "xts" "zoo"

      Does this solve your problem?

      Regards,
      Joachim

      Reply
  • Michael D McMillan
    December 20, 2021 4:25 am

    Dear Joachim,

    that first line should say “I have a data frame”.

    Apologies,
    Michael

    Reply
  • I am trying to convert a vector data to ts time series objects. I have my dates for every two years.
    2000 – 2002 – 2004-2006-2008-2010-2012-2014-2016-2018-2020. When I set my start = 2000, end = 2020, frequency =1. I get the results of my time series 2000 to 2005 only.
    I really appreciate if you can help

    Reply
    • Hey Hamad,

      I assume your data look like this?

      data <- data.frame(date = seq(2000, 2020, by = 2),
                         value = 10:20)
      data
      #    date value
      # 1  2000    10
      # 2  2002    11
      # 3  2004    12
      # 4  2006    13
      # 5  2008    14
      # 6  2010    15
      # 7  2012    16
      # 8  2014    17
      # 9  2016    18
      # 10 2018    19
      # 11 2020    20

      Then, you would first have to convert your four digit years to the Date class as shown below:

      data$date <- as.Date(ISOdate(data$date, 1, 1))
      data
      #          date value
      # 1  2000-01-01    10
      # 2  2002-01-01    11
      # 3  2004-01-01    12
      # 4  2006-01-01    13
      # 5  2008-01-01    14
      # 6  2010-01-01    15
      # 7  2012-01-01    16
      # 8  2014-01-01    17
      # 9  2016-01-01    18
      # 10 2018-01-01    19
      # 11 2020-01-01    20

      Note that we have added days and months to the date column.

      Next, we can use the xts function to convert the data frame to an xts object:

      library("xts")
       
      data <- xts(data$value, data$date)
      data
      #            [,1]
      # 2000-01-01   10
      # 2002-01-01   11
      # 2004-01-01   12
      # 2006-01-01   13
      # 2008-01-01   14
      # 2010-01-01   15
      # 2012-01-01   16
      # 2014-01-01   17
      # 2016-01-01   18
      # 2018-01-01   19
      # 2020-01-01   20

      I hope that helps!

      Joachim

      Reply
  • Thank you Joachim. It really helps a lot. I was able to run it for small vector (part of my data). However when I tried to run it for my large vector it didn’t work. My vector has 10000 values. I also tried to make it work with matrix of 11 rows and 3473608 columns didn’t work. I appreciate if you have any suggestion

    Reply
  • Here is what I have done)
    I have created data frame for my values with 11 rows and 3473608 columns (NPP.df.t.df)
    class(NPP.df.t.df)
    [1] “data.frame”
    Just to make sure I used str function: str(NPP.df.t.df)
    ‘data.frame’: 11 obs. of 3473608 variables:
    $ 522 : int 32765 32765 32765 32765 32765 32765 32765 32765 32765 32765 …
    $ 523 : int 32766 32766 32766 32766 32766 32766 32766 32766 32766 32766 …
    I created new data frame:
    data <- data.frame(date = seq(2000, 2020, by = 2), NPP.df.t.df)
    data$date <- as.Date(ISOdate(data$date, 1, 1))
    ***Not sure I think need to change the (1,1) because I have 3473608 columns for values dataframe.
    Also, I am not sure how to name all these large number of columns (can I leave it without names) to move to the last step : data <- xts(data$value, data$date)

    Reply
  • I also tried the following code:
    data$date data <- xts(data$NPP.df.t.df, data$date)
    I got this error:

    Error in xts(data$NPP.df.t.df, data$date) :
    'order.by' cannot contain 'NA', 'NaN', or 'Inf'

    Reply
    • Does it work when you use data <- xts(data[ , colnames(data) != "date"], data$date) instead of data <- xts(data$NPP.df.t.df, data$date) at the end of your code?

      Reply
  • Just to confirm. I supposed to change this code: as.Date(ISOdate(data$date, 1, 1))
    to as.Date(ISOdate(data$date, 1, 3473608))…..#1 is for the date column and 3473608 for my values. Also, my columns don’t have names.

    Reply
  • I run it again and received the following error:
    Found more than one class “xts” in cache; using the first, from namespace ‘quantmod’
    Also defined by ‘rts’ ‘spacetime’

    Reply
    • The 1, 1, within ISOdate stands for the first day and month of the year, so this should not be changed.

      You can name all your variables by using the following code:

      colnames(data) <- paste0("x", 1:ncol(data))

      I don’t know what this error message means, but it seems like it is explained here.

      Regards,
      Joachim

      Reply
  • Sorry for being annoying. I was able to run the codes you have provided. They are working but in final step I am not getting xts time series instead an xts object of zero width
    1. I created data frame for my values with 11 row and 3473608 colunms
    str(s.uname.t)
    int [1:11, 1:3473608] 32765 32765 32765 32765 32765 32765 32765 32765 32765 32765 …
    – attr(*, “dimnames”)=List of 2
    ..$ : NULL
    ..$ : chr [1:3473608] “x1” “x2” “x3” “x4” …

    2. I run this code to add columns names
    colnames(s.uname.t) <- paste0("x", 1:ncol(s.uname.t))

    3. created a data frame for date and values (which is my data frame from the previous step
    data <- data.frame(date = seq(2000, 2020, by = 2),
    s.uname.t)
    str(data)
    data.frame': 11 obs. of 3473609 variables:
    $ date : num 2000 2002 2004 2006 2008 …
    $ x1 : int 32765 32765 32765 32765 32765 32765 32765 32765 32765 32765 …
    $ x2 : int 32766 32766 32766 32766 32766 32766 32766 32766 32766 32766 …

    4. converted to get date class
    data$date <- as.Date(ISOdate(data$date, 1, 1))

    str(data$date)
    Date[1:11], format: "2000-01-01" "2002-01-01" "2004-01-01" "2006-01-01" "2008-01-01" "2010-01-01" …
    class(data$date)
    [1] "Date"

    5. converted to xts time series
    data <- xts(data$s.uname.t, data$date)
    str(data)
    An 'xts' object of zero-width

    Reply
  • Just forget to add that when I tried to plot I got this message
    plot(data)
    Error in if (length(col) < ncol(x)) col <- rep(col, length.out = ncol(x)) :
    argument is of length zero

    Reply
  • Hi Joachim,
    I need your help to show how to exclude specific dates and hours rows from a long data frame (6000 rows x 80 variables). I need to exclude about 120 hours in different days. Each row is representing hourly average measurements, the first column is date/time format, i.e., “2021-12-31 16:00”

    Reply
  • Thank you for your concern. I was able to run the xts timeseries but I think because it’s very large even though I have large memory in laptop. I am trying to find a method or package that might help speeding the process. it is just taking very long time.

    Reply
    • Glad it works now, even though it’s slow. Unfortunately, I don’t have an idea how to speed up this process. Let me know in case you find a faster alternative, I’m also curious now!

      Regards,
      Joachim

      Reply
  • Sir ,
    I got stuck with this kind of data format and i wanted it to be in ususal date column and value
    where each column is a month.
    2016 1.4 1.0 0.9 1.1 1.0 1.0 0.8 1.1 1.5 1.6 1.7 2.1 1.3
    2017 2.5 2.7 2.4 2.2 1.9 1.6 1.7 1.9 2.2 2.0 2.2 2.1 2.1
    2018 2.1 2.2 2.4 2.5 2.8 2.9 2.9 2.7 2.3 2.5 2.2 1.9 2.4
    2019 1.6 1.5 1.9 2.0 1.8 1.6 1.8 1.7 1.7 1.8 2.1 2.3 1.8
    2020 2.5 2.3 1.5 0.3 0.1 0.6 1.0 1.3 1.4 1.2 1.2 1.4 1.2
    2021 1.4 1.7 2.6 4.2 5.0 5.4 5.4 5.3 5.4 6.2 6.8 7.0 4.7

    I have forgotten on the steps to make it in ususal format.

    So how one should proceed for formatting such data

    Reply
  • Sir,
    I realized that the 13th value is the average column.

    Reply
    • Hi Mandar,

      How should this column be formatted in the final output? Could you also share the class of your current data? What is returned when you execute the following code:

      head(your_data)
      class(your_data)

      Regards,
      Joachim

      Reply
  • Dear Joachim,

    First of all, thanks a lot for your great videos and tutorials.

    I am trying to change a Data.Frame to a XTS objective. At the same time, I want to transpose some rows to columns, and reorganize the data by a column `date`.

    This is the current Data.Frame I have:

    Date Ticker Price
    2020-01-01 AAPL 90.12
    2020-01-02 AAPL 90.40
    2020-01-03 AAPL 89.70
    2020-01-01 IBM 40.29
    2020-01-02 IBM 39.78
    2020-01-03 IBM 41.01

    This is the XTS object that I want to create:

    Date AAPL.Price IBM.Price
    2020-01-01 90.12 40.29
    2020-01-02 90.40 39.78
    2020-01-03 89.70 41.01

    Do you have any advice or suggestions, on how I can do this?

    Many thanks in advance!

    Kind regards,
    Max

    Reply
    • Hi Max,

      Thank you very much for the kind feedback, glad to hear that!

      I’m sorry for the delayed response. I was on a long vacation, so unfortunately I wasn’t able to get back to you earlier. Do you still need help with your syntax?

      Regards,
      Joachim

      Reply
      • Dear Joachim,

        thank you for your reply and I hope you had a good holiday.

        All fine, I already managed to get it solved in some way 🙂
        Not specifically with this syntax but by access to a different dataset.

        Best regards,
        Max

        Reply
  • I have a data frame with date and variable…
    > HE_ratio_totP_ts
    SampleDate HE_ratio
    1995-06-06 3.701389
    1995-07-06 8.271739
    1995-08-02 14.437500
    1995-08-30 24.850000
    1996-02-06 2.477477
    1996-06-10 1.958763
    1996-07-08 3.948529
    1996-08-05 12.000000
    1996-09-05 41.107143
    1997-02-17 1.227642

    I have biweekly data for some of the years during the summer and monthly data for other years during the summer. I would like to convert this data frame to a time series and interpolate observations for every 14 days. Do you have any suggestions?

    Reply
    • Hello Lauren,

      I haven’t worked with time series personally, but I found this solution that could be useful.

      # install.packages("dplyr")
      # install.packages("lubridate")
       
      library(dplyr)
      library(lubridate)
       
      data <- data.frame(
        SampleDate = as.Date(c("1995-06-06", "1995-07-06", "1995-08-02", "1995-08-30",
                               "1996-02-06", "1996-06-10", "1996-07-08", "1996-08-05",
                               "1996-09-05", "1997-02-17")),
        HE_ratio = c(3.701389, 8.271739, 14.437500, 24.850000,
                     2.477477, 1.958763, 3.948529, 12.000000,
                     41.107143, 1.227642)
      )
       
      date_range <- seq(min(data$SampleDate), max(data$SampleDate), by = "14 days")
       
      interpolated <- approx(data$SampleDate, data$HE_ratio, xout = date_range)
       
      HE_ratio_interpolated_df <- data.frame(
        SampleDate = as.Date(interpolated$x),
        HE_ratio = interpolated$y
      )
       
      head(HE_ratio_interpolated_df)
      #   SampleDate  HE_ratio HE_ratio.1
      # 1 1995-06-06  3.701389   3.701389
      # 2 1995-06-20  5.834219   5.784382
      # 3 1995-07-04  7.967049   7.942823
      # 4 1995-07-18 11.012077  10.486741
      # 5 1995-08-01 14.209138  14.114537
      # 6 1995-08-15 19.271875  19.284132

      The approx() function helps to linearly interpolate data with irregular intervals, just like in your case. As seen, it takes the irregular dates and returns to regular biweekly dates. You can visualize the results as follows:

      plot(HE_ratio_interpolated_df$SampleDate, HE_ratio_interpolated_df$HE_ratio)

      But be aware that this function does linear interpolation. You need different functions for different degrees of interpolation. For instance, spline() for the cubic one.

      Regards,
      Cansu

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.

Top