How to turn a Pandas groupby to a dataframe?

While analyzing data you typically group data using the very handy Pandas groupby method. But what if once data is grouped, you would like to continue your analysis and probably visualize your data. If that is the case, then you might want to convert your Groupby results from a Series to a DataFrame, so that you can take advantage of its flexibility and great features.

Creating example Data

We’ll start by creating data and grouping it using the groupby DataFrame method

import pandas as pd

# define the dataset
language = ['VisualBasic', 'VisualBasic', 'Java', 'Java', 'VisualBasic', 'Java', 'C#', 'VisualBasic']
office = ['Atlanta', 'New York', 'New York', 'Boston', 'Atlanta', 'New York', 'Atlanta', 'New York']
salary = [109.0, 110.0, 106.0, 107.0, 110.0, 113.0, 128.0, 131.0]
salaries = dict(month=month, language=language, office=office, salary = salary)
hiring = pd.DataFrame(data=salaries)

# grouping the data
hiring_gp = hiring.groupby(['language','month'])['salary'].mean()
print(hiring_gp)

We now have a Series that contains the grouping results:

Convert the Groupby to a DataFrame with to_frame()

The first option to convert the grouped data to a DataFrame is using the Series method to_frame(). Here’s the snippet:

hiring_gp.to_frame()

and the result:

Turn Series to DataFrame with unstack

We can also use the unstack Series method:

hiring_gp.unstack(level='month')

And here is the result:

Group data directly into a DataFrame with groupby.agg

We can obviously aggregate data directly into a DataFrame, using the groupby.agg method:

hiring.groupby(['language','month']).agg (avg_salary= ('salary', 'mean'))

Pandas groupby DataFrame to csv

We can easily write our data to a comma separated value (CSV) file.

hiring_gp.unstack(level='month').to_csv('hiring.csv')

Converting the GroupBy to Excel

We can easily use the Series to_excel method to convert the grouped data to Excel:

hiring_gp.to_excel

Alternatively we can first create a DataFrame and use a similar to_excel method for DataFrames

import openpyxl
hiring_gp.unstack(level='month').to_excel('hiring.xlsx')

You might want to look into a more comprehensive tutorial on how to export Pandas DataFrame to Excel workbooks.

Next Learning