How to calculate the average of one or more columns in a Pandas DataFrame?

In today’s quick tutorial we’ll use Python and the Pandas library to calculate the mean of one or more columns in a Pandas DataFrame.

Let’s get started by prepping our test DataFrame. As usual, we’ll use the auto-generated candidates data.

Here we go:

import pandas as pd
data = pd.read_csv('survey.csv')

print(data)

Here’s the result. Note that you can copy the following data and use the pd.read_clipboard() function to populate your own DataFrame and follow along.

        month  salary  num_candidates
1       April   118.0            83.0
2    February   127.0            80.0
3         May   122.0            75.0
4        July   146.0            82.0
5   September   122.0            79.0
6    February   130.0            90.0
7        July   118.0            73.0
8    November   116.0            77.0
9    February   114.0            88.0
10    October   147.0            78.0
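
If you don’t have survey.csv at hand, the same test data can be built directly in code. This is a minimal sketch that reproduces the table above; the 1-based index is an assumption made to match the printed output.

```python
import pandas as pd

# Rebuild the sample data shown above without needing survey.csv.
data = pd.DataFrame(
    {
        "month": ["April", "February", "May", "July", "September",
                  "February", "July", "November", "February", "October"],
        "salary": [118.0, 127.0, 122.0, 146.0, 122.0,
                   130.0, 118.0, 116.0, 114.0, 147.0],
        "num_candidates": [83.0, 80.0, 75.0, 82.0, 79.0,
                           90.0, 73.0, 77.0, 88.0, 78.0],
    },
    index=range(1, 11),  # assumption: 1-based index, as in the printout
)

print(data)
```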

Find the mean / average of one column

To find the average of one column (Series), we simply type:

data['salary'].mean()

The result will be 126.0.
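
Real survey data often has gaps. As a small sketch using a made-up three-value Series (not the tutorial’s data), mean() skips missing values by default, while skipna=False makes a missing value propagate into the result:

```python
import pandas as pd

# Hypothetical salaries with one missing entry (illustration only).
salary = pd.Series([118.0, None, 122.0], name="salary")

# Default behaviour: NaN is ignored, so the mean covers the two known values.
mean_skipping_nan = salary.mean()

# skipna=False: any NaN in the Series makes the result NaN.
mean_with_nan = salary.mean(skipna=False)

print(mean_skipping_nan)  # 120.0
print(mean_with_nan)      # nan
```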

Calculate mean of multiple columns

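In our case, we can simply invoke the mean() method on the DataFrame itself. Because the month column holds text, we pass numeric_only=True so that only the numeric columns are averaged (recent pandas versions raise an error otherwise).

data.mean(numeric_only=True)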

The result will be:

salary            126.0
num_candidates     80.5
dtype: float64

Chances are that your DataFrame will be wider and contain many more columns. In that case, we’ll first subset the DataFrame to the relevant columns and then calculate the mean.

cols = ['salary', 'num_candidates']

data[cols].mean()

The result will be the same as above, since both numeric columns are selected.
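
As a self-contained sketch of the subsetting step, the snippet below uses the first three rows of the sample data plus a hypothetical extra column, years_experience, that we want to leave out of the calculation:

```python
import pandas as pd

# First three rows of the sample data; years_experience is a made-up
# column added only to make the DataFrame wider.
wide = pd.DataFrame(
    {
        "month": ["April", "February", "May"],
        "salary": [118.0, 127.0, 122.0],
        "num_candidates": [83.0, 80.0, 75.0],
        "years_experience": [2, 5, 3],
    }
)

cols = ["salary", "num_candidates"]

# Selecting with a list of labels returns a DataFrame;
# mean() then returns one value per selected column.
subset_means = wide[cols].mean()

print(subset_means)
```

Only the columns named in cols appear in the output, so the text column and the extra numeric column are both ignored.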

Moving on: Creating a DataFrame or list from your column mean values

You can easily turn your mean values into a new DataFrame or a list:

data_mean = pd.DataFrame(data.mean(numeric_only=True), columns=['mean_values'])

# create a list of mean values
mean_list = data.mean(numeric_only=True).to_list()
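
Put together, the conversion looks like the sketch below. It uses a two-row extract of the sample data and Series.to_frame() as an alternative to the pd.DataFrame() constructor; the variable names are illustrative.

```python
import pandas as pd

# Two rows of the sample data (numeric columns only).
data = pd.DataFrame(
    {"salary": [118.0, 127.0], "num_candidates": [83.0, 80.0]}
)

means = data.mean()  # Series indexed by column name

# Series -> single-column DataFrame
data_mean = means.to_frame(name="mean_values")

# Series -> plain Python list (to_list is a method, so it needs parentheses)
mean_list = means.to_list()

print(data_mean)
print(mean_list)  # [122.5, 81.5]
```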

Or even a simple bar chart that you can use in a PowerPoint deck:

data.mean(numeric_only=True).plot(kind='bar');

Here’s the chart:

Calculate the mean of your columns with df.describe()

We can use the DataFrame method describe() to get a quick summary of the key statistics for our DataFrame’s numeric columns, including the mean. Chaining round() with no arguments rounds every statistic to zero decimal places.

data.describe().round()

And the result: