How to check if one or multiple Pandas rows exist?

Step 1: Prepare your Data

We will first create a couple of a simple DataFrame:

import pandas as pd 

hired = pd.DataFrame(
                    {'area': ['Python', 'DAX', 'Power BI'],
                    'applicants' : [110, 100, 130],
                     'hired' : [11, 10, 13]
                     })

Step 2: Check if a single row exists in your DataFrame

Let’s assume we have a simple list we want to add as a row to our DataFrame.

new_row = ['Python', 110, 11]

Using the loc accessor we can check whether there is a row which already contains same values in your DataFrame.

if hired.loc[(hired['area'] == 'Python') & (hired['applicants'] == 110) & (hired['hired'] == 11)].empty is True:
    print('Your df doesn\'t contain your row')
else:
    print('Your df contains your row')

Step 3: Check if multiple rows exists in another DataFrame

Let’s define a second DataFrame structure:

applications = pd.DataFrame(
                    {'area': ['Python', 'R', 'Power BI'],
                    'applicants' : [110, 120, 130],
                     })

A simple way to check whether rows exists is to merge the two DataFrames into a new one:

apps_vs_hired = pd.merge(applications,hired, on = ['area', 'applicants'], how = 'outer', indicator='exists_yes_no')

Let’s look at the new DataFrame:

apps_vs_hired.head()
areaapplicantshiredexists_yes_no
0Python11011.0both
1R120NaNleft_only
2Power BI13013.0both
3DAX10010.0right_only

The exists_yes_no column depicts whether all rows in both DataFrames exists in each other.

We can make the column values a bit more legible by mapping them to simpler to understand values using a Python dictionary:

exist_dict = {'left_only': 'Row exists in Applicants only',
             'right_only': 'Row exists in Hired only',
             'both': 'Row exists in both'}

apps_vs_hired['exists_yes_no'] = apps_vs_hired['exists_yes_no'].map(exist_dict)

We can now take a look at our DataFrame and clearly understand whether columns appear in any or both of the datasets:

areaapplicantshiredexists_yes_no
0Python11011.0Row exists in both
1R120NaNRow exists in Applicants only
2Power BI13013.0Row exists in both
3DAX10010.0Row exists in Hired only