How to find the max of two or more columns in Pandas?

There might be cases in which you’ll need to perform calculations across several columns in your Pandas DataFrame. In this post we’ll learn how to easily use the max() DataFrame to retrieve the maximum value across two or more Pandas column.

Example DataFrame

We’ll get started with a very simple DataFrame that you can use in your Python environment in order to follow along with this example.

import pandas as pd
sales_dict = {'Year':['2019', '2020', '2021', '2022'], 'sales_Q1':[150, 250, 175, 190], 'sales_Q2': [ 160, 270, 220, 185], 'sales_Q3' : [ 285, 210, 290, 400] }

sales = pd.DataFrame (sales_dict)

sales

Here’s our data:

Yearsales_Q1sales_Q2sales_Q3
02019150160285
12020250270210
22021175220290
32022190185400

Calculate the maximum value of two columns

As a simple example, we will calculate for example the maximum of the Q1 and Q2 columns.We first need to subset our DataFrame and apply the max() function. There are several ways to subset a DataFrame.

Using the brackets notation:

subset = sales[['Q1', 'Q2']]

Using the loc and iloc indexers:

# using index location

subset = sales.iloc[:,1:3]

#or using loc

subset = sales.loc[:,['Q1', 'Q2']]

Note that you can use the same method to calculate the maximum across any number of columns. Now we’ll find the maximal value:

sales['max_two_cols'] = subset.max(axis=1)

Note the usage of the axis=1 parameter to ensure that the max calculations is done across the columns (instead of across the DataFrame rows).

Here’s our DataFrame:

Yearsales_Q1sales_Q2sales_Q3max_q1_q2
02019150160285160
12020250270210270
22021175220290220
32022190185400190

Maximum of Pandas column and constant

In a similar fashion we can find the maximum value between a constant and a column. We’ll compare between the Q1 figure and a quarterly target expense.

Method A – using numpy.maximum()

import numpy as np
q_expense = 210
sales['max_q1_q_expense'] = np.maximum(sales['sales_Q1'], q_expense)

Method B – using two Pandas columns

sales['q_expense'] = 210
sales['max_q1_q_expense'] = sales[['sales_Q1', 'q_expense']].max(axis=1)

And our DataFrame will look as following:

Yearsales_Q1sales_Q2sales_Q3max_q1_q2q_expensemax_q1_q_expense
02019150160285160210210
12020250270210270210250
22021175220290220210210
32022190185400190210210

Related learning