How to convert Pandas columns to datetime format?

When wrangling data extracted from different sources, you might need to change the data types of several columns from standard Pandas objects / Python strings to the datatime64 format.

In this short tutorial we’ll look into different cases:

  • Casting entire column to Datatime
  • Handling column conversion to dates with read_csv
  • Keep only dates when converting a column
  • Change the format of a datetime object
  • Set a dateindex

Create the example dataframe

As we typically do, we’ll create a simple Pandas DataFrame for you to follow along this example.

import pandas as pd

hire_date = ['1-1-2022', '15-1-2022', '25-1-2022', '25-2-2022']
office = ['Atlanta','Atlanta', 'Boston', 'London' ]
salary = [153, 143, 188, 199]
 

hiring_dict  = dict( office=office,hire_date=hire_date,salary = salary)
hiring = pd.DataFrame(data=hiring_dict)
hiring.head()

officehire_datesalary
0Atlanta1-1-2022153
1Atlanta15-1-2022143
2Boston25-1-2022188
3London25-2-2022199

Let’s take a look at the DataFrame columns types:

hiring.dtypes
office       object
hire_date    object
salary        int64
dtype: object

The hire_date column data type is object. We would like to cast the column to the datetime64 Pandas type.

Pandas Dataframe column to Datetime

We’ll use the pd.to_datetime DataFrame method to cast the column. We will pass the Date format using the format parameter.

hiring['hire_date_dt'] = pd.to_datetime(hiring['hire_date'],  format = '%d-%m-%Y')

hiring.head()

This will add a new column to the DataFrame containing the dates in datetime64 fomat.

officehire_datesalaryhire_date_dt
0Atlanta1-1-20221532022-01-01
1Atlanta15-1-20221432022-01-15
2Boston25-1-20221882022-01-25
3London25-2-20221992022-02-25

Instead of defining a new column, we can simply replace the current column:

hiring['hire_date'] = pd.to_datetime(hiring['hire_date'],  format = '%d-%m-%Y')

Note that by specifying a time format we were able to convert the column to a datetime object that displays the data but not the time.

Convert column to datatime with read_csv

When importing data from a csv file, you can automatically cast specific or multiple columns to the Datatime type. Here’s a simple example in which we use the parse_dates parameter to define the column to be converted.

hiring3 = pd.read_csv('hiring.csv', parse_dates = 'hire_date')

Convert to datetime with astype and keep the dates only

There will be instances in which you have a column that contains date values which contain also hours, minutes and seconds. Let’s quickly add a new column to our DataFrame.

hiring['full_date'] = ['1-1-2022 13:45:00', '15-1-2022 08:45:00', '25-1-2022 12:45:00', '25-2-2022 13:46:00']
hiring['full_date'] = hiring['full_date'].astype('datetime64')

As we are interested to see only the dates, but not the hours, minutes, seconds and so forth, we’ll extract the date only with the dt accessor:

hiring['date'] = hiring['full_date'].dt.date
hiring['date'].head()

Here’s our DataFrame:

officehire_datesalaryhire_date_dtfull_datedate_only
0Atlanta1-1-20221532022-01-012022-01-01 13:45:002022-01-01
1Atlanta15-1-20221432022-01-152022-01-15 08:45:002022-01-15
2Boston25-1-20221882022-01-252022-01-25 12:45:002022-01-25
3London25-2-20221992022-02-252022-02-25 13:46:002022-02-2

Change Datetime format in Pandas columns

After converting our column to the DataFrame object we can easily convert the format of our date objects using the strftime string to datetime formatter.

Here’s a very simple example. Be sure to specify the required formatting for your datetime:

hiring['full_date'].dt.strftime('%d-%m-%y')

Set a datetime index to the DataFrame

We can now go ahead and define one of our datetime columns as the DataFrame index. For example:

hiring.index = hiring['hire_date_dt']

Additional learning