How to check if a value is True in pandas columns?

Checking if any pandas column values are True

We can easily find out whether any values in our Pandas DataFrame column value answers a specific condition using the any() method of a pandas Series. To find our whether any value in your column is greater than a constant, use the following code:

(your_df['your_column'] >= constant).any()

Creating example data

We will start by creating a simple dataset:

import pandas as pd
language = ['R', 'Java', 'Python', 'R', 'Java']
salary = [159.0, 199.0, 102.0, 122.0, 154.0]
data = dict(language = language, salary = salary)
test_df = pd.DataFrame(data=data)
test_df.head()

This will return the following data:

languagesalary
0R159.0
1Java199.0
2Python102.0
3R122.0
4Java154.

Check if any column value is True

We can check whether our Dataset contains Python or R related entries. We’ll specifically check the language Series as shown below:

(test_df['language'].isin (['Python', 'R']))

This will return a boolean array:

0     True
1    False
2     True
3     True
4    False
Name: language, dtype: bool

To check whether each of the values is True we use the Series any() method:

(test_df['language'].isin (['Python', 'R'])).any()

This will obviously return a True result.

Check if values are greater or smaller than a constant in a Series

Let’s check for example, if any of our candidate salaries is higher than 200K. Also here we will use the any() Series method to find any True results.

print((test_df['salary'] >=200).any())

As no salaries are greater than 199K, this will return a False result.

Dealing with multiple conditions

We can check if our DataFrame column values answer several conditions. We will start by defining a more complex boolean condition:

cond = (test_df['language'] == 'Python') & (test_df['salary'] > 100)

We can then slice the DataFrame and find whether at least one of the rows answers the condition:

test_df[cond].any().any()

The result is True.

Verify is a value exists in a Pandas DataFrame

Step # 1: Create Example DataFrame

We will start by creating a simple DataFrame that you can use to follow along with this example.

import pandas as pd
office = ['Toronto', 'Paris', 'Rio de Janeiro', 'Buenos Aires', 'Paris']
salary = [192.0, 217.0, 230.0, 203.0, 117.0]
mydf = pd.DataFrame(dict(office = office, salary = salary))
mydf.head()

Here’s our DataFrame:

officesalary
0Toronto192.0
1Paris217.0
2Rio de Janeiro230.0
3Buenos Aires203.0
4Paris117.0

Step #2: Check if string exists in a DataFrame column

You can use the standard pandas str accessor to check whether a string exists in a column:

search_str = 'Rio'
filt = mydf['office'].str.contains(search_str)
mydf[filt].head()

This returns the following DataFrame subset:

officesalary
2Rio de Janeiro230.0

Step #3: Check if DataFrame row contains value

We now would like to check whether a string exists in any of the DataFrame rows:

search_str = 'Paris'
filt = mydf.apply (lambda row: row.str.contains(search_str).any(), axis = 1)
mydf[filt].head()

This will return the following rows:

officesalary
1Paris217.0
4Paris117.0

Step # 4: Check multiple items are in DataFrame columns

Here we will use the Series isin method, and loop across the columns of our DataFrame (therefore, axis=0), then we’ll filter our DataFrame to show only columns containing values from the list:

search_lst = ['Paris', 'Buenos Aires']
filt = mydf.apply (lambda cell: cell.isin(search_lst), axis = 0)
mydf.loc[:,mydf[filt].any()]
officesalary
1Paris217.0
3Buenos Aires203.0
4Paris117.0

Step # 5: Show cell containing specific value from list

In the next case we would like to find the exact Dataframe cells (row and column intersection) that contain a specific set of values. Here we go:

row = mydf.apply (lambda cell: cell.isin(search_lst).any(), axis = 1)
col = mydf.apply (lambda cell: cell.isin(search_lst).any(), axis = 0)

mydf.loc[row, col]

Here’s the result:

office
1Paris
3Buenos Aires
4Paris

Step #6: Check if a column contains values greater than a constant

For completeness – we can find rows that contains values greater than a specific number . In this example, we will search for all rows with salaries bigger than the mean salary. Here we go:

filt = mydf['salary'] > mydf['salary'].mean()
mydf[filt]