Sort pandas DataFrame by Date in Python (Example)
In this tutorial, I’ll show how to order the rows of a pandas DataFrame by dates in Python programming.
The post will contain these content blocks:
Let’s start right away…
Example Data & Software Libraries
In order to use the functions of the pandas library, we first have to import pandas.
import pandas as pd # Import pandas
As a next step, we’ll also have to create some data that we can use in the example syntax below:
data = pd.DataFrame({'dates':['02/25/2023','01/01/2024','17/11/2019','17/10/2022','02/02/2022'], # Create example DataFrame 'values':range(1, 6)}) print(data) # Print example DataFrame
Have a look at the previous table. It shows that our example data consists of five rows and the two columns “dates” and “values”.
Example: Order pandas DataFrame by Dates Using to_datetime() & sort_values() Functions
The following code illustrates how to reorder the rows of a pandas DataFrame by a date column.
For this, we first should create a copy of our example data, so that we can keep an original version of our data:
data_new = data.copy() # Create duplicate
In the next step, we have to convert the class of our date column to the datetime class using the to_datetime function. This is important for the sorting process later on, because otherwise Python would sort our data alphabetically.
data_new['dates'] = pd.to_datetime(data_new.dates) # Convert to date
Next, we can apply the sort_values function to order our pandas DataFrame according to our date variable:
data_new = data_new.sort_values(by = ['dates']) # Sort rows of data
Finally, let’s print our data to the console:
print(data_new) # Print new data set
The output of the previous Python programming code is shown in Table 2: As you can see, we have sorted our pandas DataFrame by its date column.
Video & Further Resources on this Topic
Do you need more explanations on how to order and rearrange rows by date? Then you should have a look at the following YouTube video of the Statistics Globe YouTube channel. I show and eyplain the examples of this tutorial in some more detail.
Please accept YouTube cookies to play this video. By accepting you will be accessing content from YouTube, a service provided by an external third party.
If you accept this notice, your choice will be saved and the page will refresh.
For further examples, check out the following video of Corey Schafer’s YouTube channel. In the video, he explains how to sort pandas DataFrames using the Python programming language. You can find the video below:
Please accept YouTube cookies to play this video. By accepting you will be accessing content from YouTube, a service provided by an external third party.
If you accept this notice, your choice will be saved and the page will refresh.
Furthermore, you may want to read the other Python programming tutorials on my homepage:
- pandas Library Tutorial in Python
- Change Order of Columns in pandas DataFrame
- Sort pandas DataFrame by Multiple Columns in Python
- Change pandas DataFrames in Python
- Manipulate pandas DataFrames in Python
- Sort pandas DataFrame by Column in Python
- Python Programming Tutorials
This article has illustrated how to sort and rearrange a pandas DataFrame by year, month, and day in a date column in Python programming. Tell me about it in the comments below, in case you have any additional questions. Besides that, please subscribe to my email newsletter in order to receive updates on new tutorials.
2 Comments. Leave new
Hi, thanks for this excellent tutorial!
I repeated steps using your example and I worked well.
But, then I tried with a dataframe built from reading a excel spreadsheet and it did not work so well.
Dates from the spreadsheet are in ‘dd/mm/YYYY’ format.
Below is a shortened version of the Dataframe:
df = {‘Data’: {0: ’29/09/2021′, 1: ’14/10/2021′, 2: ’18/10/2021′, 3: ’26/10/2021′, 4: ’27/10/2021′, 5: ’05/11/2021′, 6: ’02/12/2021′, 7: ’01/01/2022′, 8: ’25/01/2022′, 9: ’03/02/2022′, 10: ’29/03/2022′, 11: ’29/09/2021′, 12: ’14/10/2021′, 13: ’18/10/2021′, 14: ’26/10/2021′, 15: ’27/10/2021′, 16: ’05/11/2021′, 17: ’02/12/2021′, 18: ’01/01/2022′, 19: ’25/01/2022′, 20: ’03/02/2022′}, ‘Hora’: {0: ’11:23:45′, 1: ’09:15:38′, 2: ’15:21:09′, 3: ’15:24:23′, 4: ’15:24:33′, 5: ’15:25:44′, 6: ’15:25:54′, 7: ’15:26:20′, 8: ’15:26:30′, 9: ’15:27:00′, 10: ’15:27:07′, 11: ’15:51:11′, 12: ’10:36:19′, 13: ’10:39:19′, 14: ’10:42:18′, 15: ’09:09:32′, 16: ’09:09:59′, 17: ’09:12:25′, 18: ’09:13:17′, 19: ’11:45:14′, 20: ’13:15:11′}, ‘Status’: {0: ‘Alarme’, 1: ‘Alarme’, 2: ‘Alarme’, 3: ‘Alarme’, 4: ‘Alarme’, 5: ‘Alarme’, 6: ‘Falha’, 7: ‘Falha’, 8: ‘Falha’, 9: ‘Falha’, 10: ‘Falha’, 11: ‘Falha’, 12: ‘Falha’, 13: ‘Falha’, 14: ‘Falha’, 15: ‘Notificação’, 16: ‘Notificação’, 17: ‘Notificação’, 18: ‘Notificação’, 19: ‘Notificação’, 20: ‘Notificação’}}
When I run to_datetime() function data from ‘Data’ column is converted to datatime64[ns].
But when I try to sort data based on ‘Data’ column values it mess up the data completely.
Follows below full code for reference:
>import os, sys, pandas
>header_list = [‘Data’, ‘Hora’, ‘Status’]
>df = {‘Data’: {0: ’29/09/2021′, 1: ’14/10/2021′, 2: ’18/10/2021′, 3: ’26/10/2021′, 4: ’27/10/2021′, 5: ’05/11/2021′, 6: ’02/12/2021′, 7: ’01/01/2022′, 8: ’25/01/2022′, 9: ’03/02/2022′, 10: ’29/03/2022′, 11: ’29/09/2021′, 12: ’14/10/2021′, 13: ’18/10/2021′, 14: ’26/10/2021′, 15: ’27/10/2021′, 16: ’05/11/2021′, 17: ’02/12/2021′, 18: ’01/01/2022′, 19: ’25/01/2022′, 20: ’03/02/2022′}, ‘Hora’: {0: ’11:23:45′, 1: ’09:15:38′, 2: ’15:21:09′, 3: ’15:24:23′, 4: ’15:24:33′, 5: ’15:25:44′, 6: ’15:25:54′, 7: ’15:26:20′, 8: ’15:26:30′, 9: ’15:27:00′, 10: ’15:27:07′, 11: ’15:51:11′, 12: ’10:36:19′, 13: ’10:39:19′, 14: ’10:42:18′, 15: ’09:09:32′, 16: ’09:09:59′, 17: ’09:12:25′, 18: ’09:13:17′, 19: ’11:45:14′, 20: ’13:15:11′}, ‘Status’: {0: ‘Alarme’, 1: ‘Alarme’, 2: ‘Alarme’, 3: ‘Alarme’, 4: ‘Alarme’, 5: ‘Alarme’, 6: ‘Falha’, 7: ‘Falha’, 8: ‘Falha’, 9: ‘Falha’, 10: ‘Falha’, 11: ‘Falha’, 12: ‘Falha’, 13: ‘Falha’, 14: ‘Falha’, 15: ‘Notificação’, 16: ‘Notificação’, 17: ‘Notificação’, 18: ‘Notificação’, 19: ‘Notificação’, 20: ‘Notificação’}}
>df = pandas.DataFrame(df)
>df[‘Data’] = pandas.to_datetime(df.Data)
>df[‘Data’] = df[‘Data’].dt.strftime(‘%Y-%d-%m’)
>df = df.sort_values(by = [‘Data’])
>print(df)
Would you know why does that happen?
Would that be due to datetime format from my dataset?
Yours looks like a mixture of ‘mm/dd/YYYY’ and ‘dd/mm/YYYY’
Thanks!
Hey Guilherme,
Thanks a lot for the kind words, glad you like the tutorial!
It seems like the problem in your case is that your data is not formatted as a pandas DataFrame. Please have a look at the simplified example below:
Note that I have created a pandas DataFrame at the beginning of this example.
Regards,
Joachim