How to group by month in Pandas dataframes?

Groupby pandas dataframe data by month

Use the dt.month accessor on your date column to group your dataframe data according to a specific month. For example:

my_df.groupby(['my_date_col'].dt.month).sum()

Example data

We will start by importing the pandas library and create a a very simply DataFrame that you can use in order to follow this example.

import pandas as pd
# create a time series
dates = pd.date_range(start='11/27/2022', periods = 6, freq = 'B' )
interviews = [61, 139, 56, 124, 87, 85]

campaign  = pd.DataFrame (dict (date = dates, num_interviews = interviews))
campaign .head()

Let’s look into our data:

datenum_interviews
02022-11-2861
12022-11-29139
22022-11-3056
32022-12-01124
42022-12-0287

Grouping data by month

Now that we have the data, we are able to easily aggregate it by any time series / period and / or another columns.

In this snippet we will calculate the average number of interviews that our HR team has conducted – by month. We will use the Series dt accessor to display the month number (dt.month).

(campaign
.groupby([campaign['date'].dt.month])
.agg (avg_interviews_month = ('num_interviews' , 'mean'))
.round())

This will render the following result:

avg_interviews_month
date
1185.0
1299.0

Aggregating by month with pd.Grouper

We can achieve somewhat similar results by using a pandas grouper object:

(campaign
.groupby([pd.Grouper(key = 'date', freq = 'M')])
.agg (avg_interviews_month = ('num_interviews' , 'mean'))
.round())

The data is grouped by default by the last day of the month:

avg_interviews_month
date
2022-11-3085.0
2022-12-3199.0

Groupby month and year and then sum

In the same fashion we can aggregate by month and year. We will use the dt accessor and the strftime formatter to accomplish this. Once the data is aggregated we will apply the sum operation to add up the different row values.

(campaign
.groupby(campaign['date'].dt.strftime('%m-%y'))
.agg (total_interviews = ('num_interviews' , 'sum')))

This will render the following result:

total_interviews
date
11-22256
12-22296

Aggregate by month name

Last case for today is to groupby our data by month name. This is straightforward using the month_name() function made available by the dt accessor. Don’t forget the parentheses.

(campaign
.groupby(campaign['date'].dt.month_name())
.agg (total_interviews = ('num_interviews' , 'sum')))

Next Learning

How to aggregate pandas data by one or multiple column values?