How to group data by quarter in pandas?

Our task for today is to aggregate data by quarter in a pandas dataframe.

Example data

Let’s start by importing the pandas Data Analysis library; and then creating a very simple pandas DataFrame:

import pandas as pd

stamps = pd.date_range(start='6/28/2023', periods = 6, freq = 'B' )
sales = [122, 124, 106, 101, 145, 100]

revenue  = pd.DataFrame (dict (date = stamps, sales = sales))
revenue.head()

Here’s our data:

datesales
02023-06-28122
12023-06-29124
22023-06-30106
32023-07-03101
42023-07-04145

Grouping the data by quarter

We can use the dt.quarter property in order to then aggregate the data rows by quarter:

revenue_quarter = (revenue
.groupby([revenue['date'].dt.quarter])
.agg (sales_by_quarter = ('sales' , 'sum'))
)
# optionally, rename the index
revenue_quarter.index.name = 'quarter'
print (revenue_quarter)

This will result in the following frame:

sales_by_quarter
quarter
2352
3346

Aggregating by quarter and year

Next case is to aggregate the data by quarter and year. We will first define a new column that represent the quarter-year value for every row in our DataFrame and group by accordingly:

# define a quarter-year column
revenue['quarter-year'] = revenue['date'].dt.quarter.map(str) + '-' +revenue['date'].dt.year.map(str)

#group the data
(revenue
.groupby(revenue['quarter-year'])
.agg (sales_by_quarter = ('sales' , 'sum')))

Here’s our grouped data:

sales_by_quarter
quarter-year
2-2023352
3-2023346

Aggregate dataframe columns with a pd.grouper

We can also use a pd.grouper to aggregate our data to the last quarter day:

(revenue
.groupby([pd.Grouper(key = 'date', freq = 'Q')])
.agg (sales_by_quarter = ('sales' , 'sum')))

Exporting your groupby to a csv file

Once the data is aggregated, you can easily export it to a csv file:

revenue_quarter.to_csv('path_to_your_csv_file')

Suggested Learning

How to concatenate multiple pandas DataFrames with Python?