How to update row values based on condition in pandas DataFrames?

Today we would like to show how to change cell values based on specific conditions in a pandas DataFrame.

We will start by creating some example data, that you can use if order to follow along with this example:

rev_dict = {
             'date' : pd.date_range('1/1/2023', periods=4, freq='MS') ,       
             'team': ['B', 'A', 'A', pd.NA],  
             'direct_sales' : [303, 234, 302, 461],
             'indirect_sales' : [199, 175, 108, 142],
             }

revenue = pd.DataFrame (rev_dict)
print( revenue.head())

This will return the following DataFrame rows:

dateteamdirect_salesindirect_sales
02023-01-01B303199
12023-02-01A234175
22023-03-01A302108
32023-04-01<NA>461142

Change empty / missing values in our columns

We clearly see that this last DataFrame rows contain an empty value. We can use the fillna() function to replace values in the team column.

revenue['team'] = revenue['team'].fillna('A')

Modify value based on condition

Next we will go ahead and adjust the value of specific row cell based on a condition.

# define condition
rows = (revenue.indirect_sales > 150)

# update dataframe
revenue.loc [rows, 'team'] = 'B'

print ( revenue.head() )

Here’s our DataFrame:

dateteamdirect_salesindirect_sales
02023-01-01B303199
12023-02-01B234175
22023-03-01A302108
32023-04-01A461142

Apply a function to set values based on multiple conditions

In this example, we will define a simple function that will determine values based on row values. We will then apply the function as needed to the DataFrame column.

# define a Python function
def set_team(r):
    if (r['indirect_sales'] > 140) | (r['direct_sales'] > 400):
        return 'B'
    else:
        return 'A'

# apply function to column
revenue['team']  = revenue.apply(lambda r: set_team(r), axis=1)

Update values in multiple rows of a column

Next case is to adjust values of all cells in a column by multiplying the pandas column values by a constant. In this example, we will augment the indirect_sales column by 15%.

revenue.indirect_sales = revenue.indirect_sales *1.15

Next Learning

How to slice all columns but the first in a pandas DataFrame?