How to replace zeros with specific values in Pandas DataFrame columns?

As part of your data cleaning tasks, you might need to process large amounts of data that are incomplete or flawed. In this short tutorial, we will demonstrate several uses of the Pandas replace method to change zero values in specific DataFrame columns.

Replace zero values in Pandas columns

In Pandas, you can use the replace() method, available on both DataFrame and Series objects, to modify the content of your DataFrame cells. For example, if your DataFrame is named my_df, you can use the following code to change every cell containing zero to an empty string.

my_df.replace(to_replace=0, value='', inplace=True)
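
The call above can be sketched on a small, made-up DataFrame. The column names and values below are hypothetical and serve only as an illustration. The sketch shows that replace returns a new DataFrame when inplace is not set, and that a nested dictionary can limit the replacement to a single column.

```python
import pandas as pd

# Hypothetical data, for illustration only
my_df = pd.DataFrame({'units': [0, 5, 0], 'price': [0, 20, 30]})

# Without inplace=True, replace returns a new DataFrame and my_df is untouched
cleaned = my_df.replace(to_replace=0, value='')

# Nested dictionary form: replace zeros only in the 'units' column
units_only = my_df.replace({'units': {0: -1}})
```

Keeping the result in a new variable, as done here, makes it easy to compare the data before and after the replacement.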

Change cells with zero values – practical example

Let’s start by creating a simple Pandas DataFrame:

# Import Pandas library

import pandas as pd

# Define DataFrame columns as lists
city = ['Atlanta', 'Barcelona', 'Atlanta', 'Paris']
skill = ['Python', 'R', 'Scala', 'Python']
salary = [102, 104, 120, 0]

# Initialize DataFrame
salaries = dict(city=city, skill=skill, salary=salary)
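campaigns = pd.DataFrame(salaries)

# Display DataFrame
print(campaigns)

Here’s our DataFrame content:

        city   skill  salary
0    Atlanta  Python     102
1  Barcelona       R     104
2    Atlanta   Scala     120
3      Paris  Python       0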


Replace zero value with the column mean

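You might want to replace zero values, which often stand in for missing data, with the average value of your DataFrame column. In our case, we’ll modify the salary column. Note that the mean is computed over all rows, including the zero itself, so here it equals 81.5. Here is a simple snippet that you can use: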

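campaigns['salary'] = campaigns['salary'].replace(to_replace=0, value=campaigns['salary'].mean())

Here’s our DataFrame:

        city   skill  salary
0    Atlanta  Python   102.0
1  Barcelona       R   104.0
2    Atlanta   Scala   120.0
3      Paris  Python    81.5

Important note: By default, the replace method returns a new object and leaves the original unchanged, so we assign the result back to the column to keep the change. Calling replace with inplace=True on a column selected from the DataFrame is not reliable: in recent Pandas versions that use Copy-on-Write, the selected column behaves as a copy and the parent DataFrame is not modified.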
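
Because the zero is itself counted when the column mean is computed, the replacement value is pulled downward. A minimal sketch of one possible alternative, assuming the zeros really represent missing salaries, computes the mean over the non-zero rows only. The variable names are illustrative.

```python
import pandas as pd

salary = pd.Series([102, 104, 120, 0], name='salary')

# Mean over all rows: the zero is counted as a real value
mean_with_zero = salary.mean()

# Mean over non-zero rows only: the zero is treated as missing
mean_without_zero = salary[salary != 0].mean()

# Replace the zero with the mean of the remaining values
filled = salary.replace(0, mean_without_zero)
```

Which of the two means is appropriate depends on whether a zero is a legitimate observation in your data.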

Replacing 0 with NaN (missing) values

In the same fashion, we can change zero values to pd.NA, np.nan or None by using the following code:

campaigns['salary'] = campaigns['salary'].replace(to_replace=0, value=pd.NA)
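
As a small sketch of why marking zeros as missing can be useful, the example below uses np.nan on a standalone Series that mirrors the salary column. Once the zero is a NaN, Pandas skips it when aggregating, and fillna can fill it later. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

salary = pd.Series([102, 104, 120, 0], name='salary')

# Zeros become NaN; the integer column is converted to float64
with_nan = salary.replace(0, np.nan)

# Count the missing entries
n_missing = with_nan.isna().sum()

# mean() skips NaN by default, so only the three real salaries are averaged
mean_ignoring_missing = with_nan.mean()

# Fill the missing entries afterwards, here with that mean
filled = with_nan.fillna(mean_ignoring_missing)
```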

Change zero values to the previous value in the column

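By using the method parameter, we can specify a different replacement strategy for our cell values. In our case we will use method='ffill', which fills each zero with the value from the preceding row. Conversely, we can use method='bfill' to assign the value from the next row. Note that the method parameter of replace is deprecated in recent Pandas releases, so this snippet is intended for older versions.

campaigns['salary'] = campaigns['salary'].replace(to_replace=0, method='ffill')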
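
Since the method argument of replace is deprecated in recent Pandas releases, a possible alternative is to hide the zeros with mask and then forward-fill or backward-fill the gaps. This is a sketch of a substitute technique, not the snippet above, and the variable names are illustrative.

```python
import pandas as pd

salary = pd.Series([102, 104, 120, 0], name='salary')

# mask() turns every zero into NaN; ffill() copies the previous row's value down
forward_filled = salary.mask(salary == 0).ffill()

# bfill() copies the next row's value up; a zero in the last row has no
# successor, so it stays NaN
backward_filled = salary.mask(salary == 0).bfill()
```

The same caveat applies in the other direction: with ffill, a zero in the first row has no predecessor and remains NaN.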

Replace specific values based on condition

We can also use the loc accessor to modify cells that contain zero, as shown below. In our case, we’ll set those cells to the column’s maximum value.

condition  = (campaigns['salary'] == 0 )
campaigns.loc[condition, 'salary'] = campaigns['salary'].max()

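We can also modify values using the iloc accessor. Because iloc is position-based, it expects a boolean NumPy array rather than a boolean Series, and the column is identified by its integer position (2 for salary):

condition = (campaigns['salary'] == 0).to_numpy()
campaigns.iloc[condition, 2] = campaigns['salary'].max()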
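
Hardcoding the column position breaks as soon as the columns are reordered. A minimal sketch, assuming the same campaigns data as above, looks the position up with columns.get_loc. The salary_pos name is illustrative.

```python
import pandas as pd

campaigns = pd.DataFrame({
    'city': ['Atlanta', 'Barcelona', 'Atlanta', 'Paris'],
    'skill': ['Python', 'R', 'Scala', 'Python'],
    'salary': [102, 104, 120, 0],
})

# iloc needs a boolean NumPy array, not a boolean Series
condition = (campaigns['salary'] == 0).to_numpy()

# Look up the integer position of the column instead of hardcoding 2
salary_pos = campaigns.columns.get_loc('salary')

# Set the zero cells to the column maximum
campaigns.iloc[condition, salary_pos] = campaigns['salary'].max()
```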

Replace column cells with one value

If we would like to set every row of a column to one specific value, we can use the following snippet:

campaigns = campaigns.assign(salary = 120)
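
As a brief sketch of the difference between the two common ways of doing this: assign returns a new DataFrame and leaves the original alone, while direct column assignment changes the DataFrame itself. The updated and salaries_after_assign names are illustrative.

```python
import pandas as pd

campaigns = pd.DataFrame({
    'city': ['Atlanta', 'Barcelona', 'Atlanta', 'Paris'],
    'salary': [102, 104, 120, 0],
})

# assign() builds a new DataFrame; campaigns keeps its original salaries
updated = campaigns.assign(salary=120)
salaries_after_assign = campaigns['salary'].tolist()

# Direct assignment overwrites the column in the existing DataFrame
campaigns['salary'] = 120
```

The assign form is convenient in method chains, whereas direct assignment is shorter when the original values are no longer needed.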
