How to subset Dataframe rows by multiple conditions and columns with the loc indexer in Python?

In today’s quick tutorial we’ll learn how to filter a Python Pandas DataFrame with the loc indexer. Specifically we will look into sub-setting data using complex condition criteria beyond the basics.

Data Preparation

#Import the Pandas library
import pandas as pd

#Initialize the DataFrame
hr = pd.read_csv('interviews.csv')

# Browse the first DF rows
hr.head()

Here’s our dataframe:

languagemonthsalarynum_candidatesdays_to_hire
1PHPApril128.074.056.0
2C#February97.083.046.0
3RJanuary128.086.057.0
4SwiftOctober123.080.076.0
5SwiftDecember108.081.060

Select rows by conditions with iloc

We will start by writing a simple condition. Let’s assume that we ant to filter the rows realted to the Swift language.

subset = (hr['language'] == 'Swift')

# using the loc indexer
hr.loc[subset]
# using the brackets notation
hr[subset]

Both will render a similar result:

languagemonthsalarynum_candidatesdays_to_hire
4SwiftOctober123.080.076.0
5SwiftDecember108.081.060.0

Filter row by multiple conditions with loc

Let’s write a bit more complex statement, Instead of showing all columns, we’ll subset / select specific columns:


# Define a query string with isin
subset = hr['language'].isin (['PHP', 'R'])

# filter specific columns
cols = ['language', 'salary', 'days_to_hire']

Now we’ll pass the two parameters to the loc indexer on our DataFrame

hr.loc[subset, cols]

The result will be:

languagesalarydays_to_hire
1PHP128.056.0
3R128.057.0

Writing complex conditions (OR and AND)

In this last section will show you how to concatenate conditions using the & and | operators.

Condition based on OR operator:

subset = (hr['language'] == 'PHP') | (hr['salary'] < 128)
hr[subset]

Condition based on AND operator:

subset = (hr['language'] == 'PHP') & (hr['salary'] == 128)
hr[subset]

Drop DataFrame rows based on multiple conditions

We first define the complex condition. Then call the DataFRame drop method on the DF index.

subset = (hr['language'] == 'PHP') & (hr['salary'] == 128)
hr_new = hr.drop(hr[subset].index)
hr_new