How to delete one or multiple rows by index, condition and value in Pandas DataFrames?

In today’s data analysis tutorial we’ll learn how to remove one or multiple rows from a Pandas DataFrame.

We’ll look into several cases in which we’ll use the pd.DataFrame.drop() method in order to remove irrelevant rows from our data:

  1. Drop a row by index / row number
  2. Drop multiple rows
  3. Delete rows based on a condition
  4. Delete a row if it’s empty / null / NaN

Creating a test data set

We’ll start by importing the Pandas library into our Jupyter Notebook and creating some test data.

import pandas as pd

city =  ['New York', 'Boston', 'Atlanta','New York']
office =  ['West', 'South', 'South', 'East']
interviews = [90,89,100,pd.NA]

# create dictionary
offices = dict(city=city, office=office, interviews=interviews)

# create DataFrame from dictionary
hr = pd.DataFrame(offices)

hr.head()

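Our DataFrame has four rows, labeled 0 through 3, and three columns: city, office and interviews. The last row has a missing value in the interviews column.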

Drop row by index

In the first case, we would like to pass relevant row index labels to the DataFrame.drop() method:

rows = 3
hr.drop(index=rows)

This returns a new DataFrame without the last row (the one labeled 3). The original hr DataFrame is not modified.
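As a quick check of that behavior, here is a minimal sketch that rebuilds the test data so it can run on its own. The variable name trimmed is only illustrative.

```python
import pandas as pd

# Rebuild the tutorial's test data so the example is self-contained
hr = pd.DataFrame({
    "city": ["New York", "Boston", "Atlanta", "New York"],
    "office": ["West", "South", "South", "East"],
    "interviews": [90, 89, 100, pd.NA],
})

# drop() returns a new DataFrame; hr itself keeps all four rows
trimmed = hr.drop(index=3)

print(len(hr))              # 4
print(list(trimmed.index))  # [0, 1, 2]
```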

Deleting multiple rows by index

We can get rid of multiple rows in one call by passing a list of row labels:

rows = [2,3]
hr.drop(index=rows)
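Note that the list holds index labels, and by default drop() raises a KeyError if any of them is missing. The sketch below assumes you would rather skip missing labels and uses the errors="ignore" option of drop(). The label 10 is a made-up example of a label that does not exist.

```python
import pandas as pd

hr = pd.DataFrame({
    "city": ["New York", "Boston", "Atlanta", "New York"],
    "office": ["West", "South", "South", "East"],
    "interviews": [90, 89, 100, pd.NA],
})

# Label 10 is not in the index, so the default behavior is a KeyError
try:
    hr.drop(index=[2, 10])
    raised = False
except KeyError:
    raised = True

# errors="ignore" drops the labels that exist and skips the rest
result = hr.drop(index=[2, 10], errors="ignore")

print(raised)              # True
print(list(result.index))  # [0, 1, 3]
```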

Persisting changes

If you want the changes to apply to the DataFrame itself rather than to a returned copy, use the inplace=True parameter:


hr.drop(index=rows, inplace=True)

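Alternatively, assign the result to a new variable and keep the original DataFrame as it is. Pick one of the two approaches: if you have already dropped rows 2 and 3 in place, dropping the same labels again raises a KeyError.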


rows = [2,3]

hr1 = hr.drop(index=rows)
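To make the difference between the two approaches concrete, here is a small sketch. It works on a copy (hr_copy, a name introduced only for this example) so that the in-place call does not interfere with the assignment version.

```python
import pandas as pd

hr = pd.DataFrame({
    "city": ["New York", "Boston", "Atlanta", "New York"],
    "office": ["West", "South", "South", "East"],
    "interviews": [90, 89, 100, pd.NA],
})
rows = [2, 3]

# In-place: the object is modified and the call returns None
hr_copy = hr.copy()
returned = hr_copy.drop(index=rows, inplace=True)

# Assignment: hr stays intact and hr1 holds the result
hr1 = hr.drop(index=rows)

print(returned)             # None
print(list(hr_copy.index))  # [0, 1]
print(list(hr1.index))      # [0, 1]
print(len(hr))              # 4
```

Because the in-place call returns None, writing hr = hr.drop(index=rows, inplace=True) would replace your DataFrame with None.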

Remove the first row of a Pandas DataFrame

After importing our DataFrame data from an external file (such as CSV or JSON) or a SQL database, we might want to get rid of the header row. You can do that by tweaking your data import code or by using something simple such as the snippet below. It drops the row labeled 0, which is the first row when the DataFrame uses the default integer index.

# drop first row

hr.drop(index=0)
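If the index does not start at 0, there is no label 0 to drop. The following sketch assumes a DataFrame with custom, unique labels (10 to 13, chosen only for illustration) and removes the first row by position instead.

```python
import pandas as pd

# Same test data, but with an index that does not start at 0
hr = pd.DataFrame(
    {
        "city": ["New York", "Boston", "Atlanta", "New York"],
        "office": ["West", "South", "South", "East"],
        "interviews": [90, 89, 100, pd.NA],
    },
    index=[10, 11, 12, 13],
)

# Option 1: slice by position and keep everything after the first row
by_position = hr.iloc[1:]

# Option 2: look up the first label, then drop it
# (with duplicate labels this would drop every row sharing that label)
by_label = hr.drop(index=hr.index[0])

print(list(by_position.index))  # [11, 12, 13]
print(list(by_label.index))     # [11, 12, 13]
```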

Drop rows based on conditions

Let’s now assume that we want to filter specific rows out of our DataFrame based on a condition. In our case, we’ll remove the rows for offices that are not based in New York. We first collect the index labels of the rows that match the condition and then pass them to drop():

filt = hr[hr['city'] != 'New York'].index

hr_new_york = hr.drop(index=filt)

We could have filtered the DataFrame more easily by using the brackets notation:

hr_new_york  = hr[hr['city'] == 'New York']

Both commands produce the same result: a DataFrame containing only rows 0 and 3, the two New York offices.
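Here is a self-contained sketch that runs both approaches side by side. The names mask, dropped and filtered are illustrative, and the ~ operator inverts the boolean mask.

```python
import pandas as pd

hr = pd.DataFrame({
    "city": ["New York", "Boston", "Atlanta", "New York"],
    "office": ["West", "South", "South", "East"],
    "interviews": [90, 89, 100, pd.NA],
})

# Boolean mask: True for rows we want to remove
mask = hr["city"] != "New York"

# Approach 1: drop the labels of the matching rows
dropped = hr.drop(index=hr[mask].index)

# Approach 2: keep only the rows where the mask is False
filtered = hr[~mask]

print(list(dropped.index))   # [0, 3]
print(list(filtered.index))  # [0, 3]
```

Both results keep the original labels 0 and 3. If you want a fresh 0-based index, you can chain .reset_index(drop=True) to either result.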

Delete rows with empty (NaN) values

We have a complete tutorial on this topic which you might want to look at: Removing rows with null values in Pandas.
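For a quick taste, here is a minimal sketch using the DataFrame.dropna() method on our test data. Limiting the check to the interviews column via subset is an assumption made for this example. Calling dropna() without arguments would check every column.

```python
import pandas as pd

hr = pd.DataFrame({
    "city": ["New York", "Boston", "Atlanta", "New York"],
    "office": ["West", "South", "South", "East"],
    "interviews": [90, 89, 100, pd.NA],
})

# Remove rows whose interviews value is missing
cleaned = hr.dropna(subset=["interviews"])

print(list(cleaned.index))  # [0, 1, 2]
```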

Removing columns

We have several tutorials on deleting specific columns from your DataFrame: