How to divide DataFrame columns in Pandas?

In previous tutorials, we learnt how to sum and multiply column values in Pandas. Today, we would like to discuss several cases of applying division to Pandas DataFrames.

In this post, we will cover the step-by-step process to divide a column:

  1. By value / constant / scalar
  2. By another column
  3. By sum of a column
  4. By its first value

Create an example Pandas DataFrame

We will start by creating a very simple DataFrame with some data that you can use to follow along with the examples.

# Python3
import pandas as pd
employee = ['Dorothy', 'Sarah', 'Liam', 'Larry']
salary = [183, 48, 92, 181]
bonus = [5, 6, 4, 7]

hr = pd.DataFrame(dict(employee=employee, salary=salary, bonus=bonus))

hr.head()

Here’s our DataFrame data:

Divide columns by constant / scalar / value

Our first example divides a DataFrame column by a constant value. In our case, we’ll calculate the monthly salary of each employee. Here’s the code you’ll need to accomplish that:

# By value / constant
num_months = 12
hr['monthly_salary'] = (hr['salary'] / num_months).round(2)

hr.head()

A new column will be added to our DataFrame:
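As a minimal self-contained sketch of this step, assuming a stand-in DataFrame that holds only the salary values from the example, the snippet below divides by the constant with the `/` operator and with the equivalent `Series.div` method. The names `demo` and `via_method` are illustrative and not part of the original example.

```python
import pandas as pd

# Stand-in data (assumption): the salary values used in this tutorial
demo = pd.DataFrame({"salary": [183, 48, 92, 181]})
num_months = 12

# Operator form, as used in the tutorial
demo["monthly_salary"] = (demo["salary"] / num_months).round(2)

# Method form: Series.div performs the same element-wise division
via_method = demo["salary"].div(num_months).round(2)

print(demo["monthly_salary"].tolist())  # [15.25, 4.0, 7.67, 15.08]
```

Both forms give the same values here; the method form can be convenient when chaining several operations.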

Divide a DataFrame column by another column

Another common use case is to create a new column in our DataFrame by dividing one column by another. In this case, we’ll calculate the bonus as a percentage of the annual salary. Here we go:

# Division by another column
hr['bonus_pct'] = (hr['bonus'] / hr['salary'] * 100).round(2)

hr.head()

Here’s the resulting DataFrame:
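For readers following along without the rendered table, here is a small sketch of the same calculation, assuming a stand-in DataFrame named `demo` (an illustrative name) built from the salary and bonus values above.

```python
import pandas as pd

# Stand-in data (assumption): salary and bonus values from this tutorial
demo = pd.DataFrame({"salary": [183, 48, 92, 181], "bonus": [5, 6, 4, 7]})

# Element-wise division pairs rows by index label, then scales to a percentage
demo["bonus_pct"] = (demo["bonus"] / demo["salary"] * 100).round(2)

print(demo["bonus_pct"].tolist())  # [2.73, 12.5, 4.35, 3.87]
```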

Calculate percentage from sum of a column

In the next example, we’ll divide each value in a column by the column’s sum. This gives each row’s share of the total as a percentage.

# The denominator is the column total
total_sal = hr['salary'].sum()
hr['sal_pct'] = (hr['salary'] / total_sal * 100).round(2)

hr.head()

Here’s the resulting DataFrame:
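The share-of-total calculation can be sketched on its own as follows, assuming a stand-in DataFrame `demo` (an illustrative name) with the tutorial's salary values.

```python
import pandas as pd

# Stand-in data (assumption): the salary values used in this tutorial
demo = pd.DataFrame({"salary": [183, 48, 92, 181]})

total_sal = demo["salary"].sum()  # 504
demo["sal_pct"] = (demo["salary"] / total_sal * 100).round(2)

print(demo["sal_pct"].tolist())  # [36.31, 9.52, 18.25, 35.91]
```

Because each share is rounded separately, the rounded values in this example add up to 99.99 rather than exactly 100. If that matters, keep the unrounded values for further calculations and round only for display.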

Divide column values by the first value

In the last example, we divide all values in a column by its first value. The same logic applies to dividing by the maximum, minimum, mean, standard deviation and so forth of the column.

# .iloc[0] picks the first row by position, regardless of index labels
first_val = hr['salary'].iloc[0]

# Stored under its own name so the sal_pct column above is not overwritten
hr['sal_vs_first'] = (hr['salary'] / first_val).round(2)
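As a hedged sketch of that pattern, the snippet below divides by the first value and, as one of the variations mentioned above, by the column mean. The DataFrame `demo` and the column names `vs_first` and `vs_mean` are illustrative assumptions.

```python
import pandas as pd

# Stand-in data (assumption): the salary values used in this tutorial
demo = pd.DataFrame({"salary": [183, 48, 92, 181]})

# .iloc[0] selects the first row by position, whatever the index labels are
first_val = demo["salary"].iloc[0]
demo["vs_first"] = (demo["salary"] / first_val).round(2)

# Same pattern with another statistic of the column, here the mean (126.0)
demo["vs_mean"] = (demo["salary"] / demo["salary"].mean()).round(2)

print(demo["vs_first"].tolist())  # [1.0, 0.26, 0.5, 0.99]
print(demo["vs_mean"].tolist())   # [1.45, 0.38, 0.73, 1.44]
```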

Divide by zero error in Pandas

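There might be cases in which a value in your denominator column is equal to zero. For numeric columns, Pandas does not raise an error in that case: dividing a non-zero value by zero yields an infinite value, displayed as inf (or -inf) in your DataFrame, while dividing zero by zero yields NaN.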
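A short sketch of this behavior, assuming two illustrative integer Series named `num` and `den`:

```python
import numpy as np
import pandas as pd

# Hypothetical numerators and an all-zero denominator
num = pd.Series([10, -10, 0])
den = pd.Series([0, 0, 0])

ratio = num / den  # no exception is raised for numeric Series

print(ratio.tolist())            # [inf, -inf, nan]
print(np.isinf(ratio).tolist())  # [True, True, False]
```

Note that `np.isinf` flags only the infinite entries; the NaN produced by 0 / 0 has to be detected separately, for example with `Series.isna`.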

You might want to convert the inf values to missing values: None, np.nan or pd.NA. If so, use the following code (make sure that you replace the col_name placeholder with your relevant column name):

import numpy as np

# Keep finite values; replace +inf / -inf with pd.NA
hr['col_name'] = hr['col_name'].where(~np.isinf(hr['col_name']), pd.NA)

This will keep all values which are not infinite and replace the ones that are with pd.NA.
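An alternative that some readers may find simpler is `Series.replace`, sketched below as a swapped-in technique rather than the method used above. The DataFrame `demo` and its `num`, `den` and `ratio` columns are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with zeros in the denominator
demo = pd.DataFrame({"num": [10.0, 5.0, 0.0], "den": [2.0, 0.0, 0.0]})
demo["ratio"] = demo["num"] / demo["den"]  # 5.0, inf, NaN

# Replace both +inf and -inf with NaN; finite values are left untouched
demo["ratio"] = demo["ratio"].replace([np.inf, -np.inf], np.nan)

print(demo["ratio"].isna().tolist())  # [False, True, True]
```

Using np.nan keeps the column as a regular float column, which is usually what downstream numeric code expects.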

Additional learning: