How to calculate the difference between dates in Python Pandas?

In this tutorial we’ll learn how to calculate the time difference between two dates in a Pandas DataFrame. We’ll look into several time difference calculations: in days, months and years.

Example DataFrame

We will get started by creating a simple DataFrame that you can use in order to follow along with this example.

import pandas as pd
import numpy as np

office = ['Denver','Paris', 'London' ]
hire_date = ['1-17-2022', '4-7-2022', '2-18-2022']
start_date = ['9-17-2022', '4-29-2022', '6-19-2022']
hiring_dict  = dict( office=office,hire_date=(hire_date),start_date = start_date)
hiring_df = pd.DataFrame(data=hiring_dict, dtype='datetime64[ns]')


Here’s our data:


Calculate delta between dates in days

We can easily subtract two datetime columns in a Pandas DataFrame to get the time diff in days. Here’s a short snippet for our case:

hiring_df['diff_days'] = hiring_df['start_date']- hiring_df['hire_date']

Time difference in minutes, hours, months and years

In order to calculate time differences for any other time frequencies other than days, we’ll need to use the np.timedelta() Numpy array method. Ensure that numpy is installed in your Data Analysis environment before running this example.

For months:

#import numpy as np

hiring_df['diff_months'] = \ (hiring_df['diff_days']/np.timedelta64(1, 'M')).astype('float').round(1)

Note that the resulting calculation is provided as a float number, rounded up to one decimal. If we would like to show the calculation as an integer we could cast the resulting time difference column to the integer data type as shown below:

hiring_df['diff_months'] = \
(hiring_df['diff_days']/np.timedelta64(1, 'M')).astype('int8')

For years:

hiring_df['diff_years'] = \ (hiring_df['diff_days']/np.timedelta64(1, 'Y')).astype('float').round(1)

For hours:

hiring_df['diff_hours'] = \ (hiring_df['diff_days']/np.timedelta64(1, 'h')).astype('float').round(1)

Note: Calculating the difference in minutes is easy accomplished by multiplying the hours result by 60.

In summary, here’s our resulting DataFrame:

0Denver2022-01-172022-09-17243 days8.00.75832.0
1Paris2022-04-072022-04-2922 days0.70.1528.0
2London2022-02-182022-06-19121 days4.00.3290