How to convert Pandas columns to datetime format?

When wrangling data extracted from different sources, you might need to change the data types of several columns from standard Pandas objects / Python strings to the datatime64 format.

In this short tutorial we’ll look into different cases:

  • Casting entire column to Datatime
  • Handling column conversion to dates with read_csv
  • Keep only dates when converting a column
  • Set a dateindex

Create example dataframe

As we typically do, we’ll create a simple Pandas DataFrame for you to follow along this example.

import pandas as pd

hire_date = ['1-1-2022', '15-1-2022', '25-1-2022', '25-2-2022']
office = ['Atlanta','Atlanta', 'Boston', 'London' ]
salary = [153, 143, 188, 199]
 

hiring_dict  = dict( office=office,hire_date=hire_date,salary = salary)
hiring = pd.DataFrame(data=hiring_dict)
hiring.head()

officehire_datesalary
0Atlanta1-1-2022153
1Atlanta15-1-2022143
2Boston25-1-2022188
3London25-2-2022199

Let’s take a look at the DataFrame columns types:

hiring.dtypes
office       object
hire_date    object
salary        int64
dtype: object

The hire_date column data type is object. We would like to cast the column to the datetime64 Pandas type.

Pandas Dataframe column to Datetime

We’ll use the pd.to_datetime DataFrame method to cast the column. We will pass the Date format using the format parameter.

hiring['hire_date_dt'] = pd.to_datetime(hiring['hire_date'],  format = '%d-%m-%Y')

hiring.head()

This will add a new column to the DataFrame containing the dates in datetime64 fomat.

officehire_datesalaryhire_date_dt
0Atlanta1-1-20221532022-01-01
1Atlanta15-1-20221432022-01-15
2Boston25-1-20221882022-01-25
3London25-2-20221992022-02-25

Instead of defining a new column, we can simply replace the current column:

hiring['hire_date'] = pd.to_datetime(hiring['hire_date'],  format = '%d-%m-%Y')

Convert column to datatime with read_csv

When importing data from a csv file, you can automatically cast specific or multiple columns to the Datatime type. Here’s a simple example in which we use the parse_dates parameter to define the column to be converted.

hiring3 = pd.read_csv('hiring.csv', parse_dates = 'hire_date')

Convert to datetime with astype and keep the dates only

There will be instances in which you have a column that contains date values which contain also hours, minutes and seconds. Let’s quickly add a new column to our DataFrame.

hiring['full_date'] = ['1-1-2022 13:45:00', '15-1-2022 08:45:00', '25-1-2022 12:45:00', '25-2-2022 13:46:00']
hiring['full_date'] = hiring['full_date'].astype('datetime64')

As we are interested to see only the dates, but not the hours, minutes, seconds and so forth, we’ll extract the date only with the dt accessor:

hiring['date'] = hiring['full_date'].dt.date
hiring['date'].head()

Here’s our DataFrame:

officehire_datesalaryhire_date_dtfull_datedate_only
0Atlanta1-1-20221532022-01-012022-01-01 13:45:002022-01-01
1Atlanta15-1-20221432022-01-152022-01-15 08:45:002022-01-15
2Boston25-1-20221882022-01-252022-01-25 12:45:002022-01-25
3London25-2-20221992022-02-252022-02-25 13:46:002022-02-2

Set a datetime index to the DataFrame

We can now go ahead and define one of our datetime columns as the DataFrame index. For example:

hiring.index = hiring['hire_date_dt']

Additional learning