How to turn a Pandas groupby to a dataframe?

While analyzing data you typically group data using the very handy Pandas groupby method. But what if once data is grouped, you would like to continue your analysis and probably visualize your data. If that is the case, then you might want to convert your Groupby results from a Series to a DataFrame, so that you can take advantage of its flexibility and great features.

Creating example Data

We’ll start by creating data and grouping it using the groupby DataFrame method

import pandas as pd

# define the dataset
language = ['VisualBasic', 'VisualBasic', 'Java', 'Java', 'VisualBasic', 'Java', 'C#', 'VisualBasic']
office = ['Atlanta', 'New York', 'New York', 'Boston', 'Atlanta', 'New York', 'Atlanta', 'New York']
salary = [109.0, 110.0, 106.0, 107.0, 110.0, 113.0, 128.0, 131.0]
salaries = dict(month=month, language=language, office=office, salary = salary)
hiring = pd.DataFrame(data=salaries)

# grouping the data
hiring_gp = hiring.groupby(['language','month'])['salary'].mean()

We now have a Series that contains the grouping results.

Now we would like to return a DataFrame object out of the results of our grouping.

Convert the Groupby to a DataFrame with to_frame()

The first option to convert the grouped data to a DataFrame is using the Series method to_frame(). Here’s the snippet:


and the result:

Turn Series to DataFrame with unstack

We can use the to_frame() method that we just reviewed in order to convert Series objects to Dataframes.

Another useful method to achieve that is to use the unstack Series method:


And here is the result:

Group data directly into a DataFrame with groupby.agg

We can obviously aggregate data directly into a DataFrame, using the groupby.agg method:

hiring.groupby(['language','month']).agg (avg_salary= ('salary', 'mean'))

Groupby DataFrame by all columns (or multiple ones)

Another question we typically get is how to groupby DataFrame data by multiple columns (or even all columns). If you are in search for a solution for that look into this post on multiple column grouping in Pandas.

Pandas groupby DataFrame to csv

We can easily write our data to a comma separated value (CSV) file.


Converting the GroupBy to Excel

We can easily use the Series to_excel method to convert the grouped data to Excel:


Alternatively we can first create a DataFrame and use a similar to_excel method for DataFrames

import openpyxl

You might want to look into a more comprehensive tutorial on how to export Pandas DataFrame to Excel workbooks.

Next Learning