Find the standard deviation of Pandas DataFrame columns, rows and Series

In today’s tutorial we will learn how to calculate the standard deviation of a Pandas DataFrame. We’ll calculate the standard deviation for several cases:

  • A Pandas Series
  • One or more DataFrame columns
  • Each row of a Pandas DataFrame
  • A groupby object

Example DataFrame

We’ll start by importing the Pandas library and reading a csv file with our data into a new DataFrame.

# Import Pandas library
import pandas as pd

# Create a DataFrame by reading a CSV file
survey = pd.read_csv('hr_data.csv')

Here’s the DataFrame:

Calculate the standard deviation of a Pandas Series

In this simple example, we’ll call the std() method on a single Series (one DataFrame column).

# standard deviation of a series
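Since hr_data.csv isn’t available here, the following is a minimal sketch that uses a small made-up DataFrame in its place. The column names language, avg_salary and num_cand, as well as all the values, are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for hr_data.csv; the values are made up.
survey = pd.DataFrame({
    "language": ["Python", "R", "Java", "Python"],
    "avg_salary": [100.0, 120.0, 110.0, 130.0],
    "num_cand": [10, 14, 12, 16],
})

# Selecting a single column returns a Series.
# Series.std() computes the sample standard deviation (ddof=1) by default.
sample_std = survey["avg_salary"].std()

# Pass ddof=0 to get the population standard deviation instead.
population_std = survey["avg_salary"].std(ddof=0)

print(sample_std, population_std)
```

Note that the default differs from NumPy’s np.std(), which uses ddof=0 unless told otherwise.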

Standard deviation of one or more DataFrame columns

In this case we will calculate the standard deviation for all columns or for specific ones.

For the whole DataFrame:
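Here’s a minimal sketch of the whole-DataFrame case, again on a made-up DataFrame (the column names and values are assumptions, not the contents of hr_data.csv). Because our sample has a text column, we pass numeric_only=True so that only the numeric columns are used; recent Pandas versions raise an error otherwise.

```python
import pandas as pd

# Hypothetical stand-in for hr_data.csv; the values are made up.
survey = pd.DataFrame({
    "language": ["Python", "R", "Java", "Python"],
    "avg_salary": [100.0, 120.0, 110.0, 130.0],
    "num_cand": [10, 14, 12, 16],
})

# Standard deviation of every numeric column (axis=0 is the default).
# The result is a Series indexed by column name.
col_std = survey.std(numeric_only=True)

print(col_std)
```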


For specific columns:

We’ll first subset the DataFrame according to specific column labels and then call the std() method.

# Subset the DataFrame by column labels, then call std()
cols = ['num_cand', 'avg_salary']
survey[cols].std()

Standard deviation of each row in a Pandas DataFrame

As we would like to calculate the standard deviation across each row rather than down each column, we’ll pass the axis=1 parameter to std().

# standard deviation of each row
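A row-wise standard deviation is most meaningful when the columns share the same unit, so this sketch uses three made-up yearly salary columns. The salary_2021 to salary_2023 column names and their values are assumptions for illustration and do not come from hr_data.csv.

```python
import pandas as pd

# Hypothetical data: one row per language, one column per year.
survey = pd.DataFrame({
    "language": ["Python", "R"],
    "salary_2021": [100.0, 90.0],
    "salary_2022": [110.0, 90.0],
    "salary_2023": [120.0, 90.0],
})

# axis=1 computes the standard deviation across the columns of each row.
# numeric_only=True skips the text column.
row_std = survey.std(axis=1, numeric_only=True)

print(row_std)
```

The result is a Series with one value per row, indexed like the original DataFrame.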

Standard deviation of Pandas GroupBy objects

In this example we’ll:

  • First aggregate the data by one (or multiple) columns.
  • Then compute an aggregated figure, in this case the standard deviation of the salary figures within each group.

# Standard deviation per group (the result column is named salary_std)
survey.groupby('language').agg(salary_std = ('salary', 'std'))
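To try the groupby step without the original file, here’s a minimal sketch on made-up data. The language and salary columns follow the snippet above, while the values and the salary_std result name are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for hr_data.csv; the values are made up.
survey = pd.DataFrame({
    "language": ["Python", "Python", "R", "R", "Java", "Java"],
    "salary": [100.0, 120.0, 90.0, 110.0, 80.0, 80.0],
})

# Named aggregation: result_column=(source_column, aggregation_function).
salary_std = survey.groupby("language").agg(salary_std=("salary", "std"))

print(salary_std)
```

The result is a DataFrame indexed by language, with the groups sorted alphabetically by default. A group whose values are all identical (Java in this sample) has a standard deviation of 0.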

Plotting a standard deviation

If we would like to quickly plot the standard deviation figures as a simple graph, we can use the Pandas DataFrame.plot() method.

Note that we can also create more sophisticated charts by leveraging the Matplotlib and Seaborn libraries to their full extent.
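One possible way to draw such a chart is sketched below, assuming Matplotlib is installed (DataFrame.plot() relies on it). The data, the bar chart type, the title and the axis label are all our own choices for illustration.

```python
import pandas as pd

# Hypothetical stand-in for hr_data.csv; the values are made up.
survey = pd.DataFrame({
    "language": ["Python", "Python", "R", "R", "Java", "Java"],
    "salary": [100.0, 125.0, 90.0, 110.0, 80.0, 85.0],
})

# Standard deviation of the salary per language.
salary_std = survey.groupby("language").agg(salary_std=("salary", "std"))

# DataFrame.plot() returns a Matplotlib Axes that we can customize further.
ax = salary_std.plot(
    kind="bar",
    legend=False,
    title="Salary standard deviation by language",
)
ax.set_ylabel("Standard deviation")

# In a script, call matplotlib.pyplot.show() to display the chart.
```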


Here’s our chart: