How to drop columns by name in pandas DataFrames?

In today’s tutorial we’ll learn how to drop one or more columns from your DataFrame, either by name or based on predefined conditions.

Define an example DataFrame

We’ll start by importing the pandas library and then create a very simple DataFrame populated with some candidate data:

import pandas as pd

month = ['November', 'June', 'October']
language = ['Python', 'Javascript', 'Python']
office = ['Toronto', 'Los Angeles', 'Bangkok']
salary = [188.0, 172.0, 116.0]
salaries = dict(month=month, office=office, prog_language=language, monthly_salary=salary)
interviews_df = pd.DataFrame(data=salaries)
interviews_df.head()

Delete a single column by name

The simplest case is dropping a single column from the DataFrame:

# define column to remove
col = 'office'

# remove the column and assign the result to a new DataFrame
interviews_df_1 = interviews_df.drop(col, axis=1)
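
The same call also accepts a list of names, which covers the multiple-column case. Here is a minimal sketch that rebuilds the example data under a hypothetical name, demo_df, so that it runs on its own. The columns= keyword is an equivalent spelling of axis=1.

```python
import pandas as pd

# Rebuild the example data so the snippet is self-contained
# (demo_df is a hypothetical name, not one of the tutorial's variables).
demo_df = pd.DataFrame({
    'month': ['November', 'June', 'October'],
    'office': ['Toronto', 'Los Angeles', 'Bangkok'],
    'prog_language': ['Python', 'Javascript', 'Python'],
    'monthly_salary': [188.0, 172.0, 116.0],
})

# Drop several columns at once by passing a list of names.
two_cols_removed = demo_df.drop(['office', 'month'], axis=1)

# columns= is an equivalent spelling that does not need the axis argument.
same_result = demo_df.drop(columns=['office', 'month'])

print(list(two_cols_removed.columns))
```

Neither call touches demo_df itself; both return a new DataFrame.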

Note: When calling the drop method, you can pass inplace=True to remove the column from the existing DataFrame directly instead of getting back a new one.


interviews_df.drop(col, axis=1, inplace=True)
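
To make the difference between the two styles concrete, here is a small sketch on a throwaway DataFrame (demo_df is a hypothetical name used only for this illustration). Note that with inplace=True the call returns None, so its result should not be assigned back to a variable.

```python
import pandas as pd

# Throwaway data for illustration only.
demo_df = pd.DataFrame({'office': ['Toronto', 'Bangkok'],
                        'monthly_salary': [188.0, 116.0]})

# Without inplace, drop returns a new DataFrame and leaves demo_df untouched.
trimmed = demo_df.drop('office', axis=1)
still_there = 'office' in demo_df.columns

# With inplace=True, demo_df itself is modified and the call returns None.
result = demo_df.drop('office', axis=1, inplace=True)

print(still_there, result is None, list(demo_df.columns))
```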

Unable to delete columns

If removing a column from your DataFrame doesn’t seem to be working for you, the cause is most likely one of these two:

  • You drop the column from the DataFrame, but when displaying its contents, you still see the column you dropped. If that’s the case, remember to either assign the modified DataFrame to a new variable, or use the inplace=True parameter mentioned above.
  • You get the following error:
KeyError: "<your column name> not found in axis"

The problem here is that drop uses axis=0 by default, so pandas looks for the column name among the row labels. The solution is to pass axis=1, as shown in the examples throughout this tutorial.
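
The error is easy to reproduce. The sketch below, again on a hypothetical demo_df, triggers it on purpose and then shows two ways to point drop at the columns.

```python
import pandas as pd

# Throwaway data for illustration only.
demo_df = pd.DataFrame({'office': ['Toronto', 'Bangkok'],
                        'monthly_salary': [188.0, 116.0]})

message = None
try:
    # No axis given: pandas searches the row labels (0 and 1) for 'office'.
    demo_df.drop('office')
except KeyError as err:
    message = str(err)
    print('KeyError:', message)

# Either spelling targets the columns instead of the rows.
fixed = demo_df.drop('office', axis=1)
also_fixed = demo_df.drop(columns='office')
```

The same KeyError also appears with axis=1 when the name simply isn’t one of the columns, so it is worth checking the spelling against the DataFrame’s columns.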

Remove columns if they exist in the DataFrame

In the next example, we’ll trigger the column deletion only if the column name is part of the columns index.

To display the list of columns in our DataFrame, we use the following snippet. The result is an Index object.

interviews_df.columns

We can then write a very simple conditional statement that removes the column only if its name is part of the index:

col = 'office'
if col in interviews_df.columns:
    interviews_df_2 = interviews_df.drop(col, axis=1)
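
pandas can also do the existence check for us: drop accepts errors='ignore', which silently skips any label that is not present. A minimal sketch, with demo_df and the column name not_a_column being hypothetical names chosen for the illustration:

```python
import pandas as pd

# Throwaway data for illustration only.
demo_df = pd.DataFrame({'office': ['Toronto', 'Bangkok'],
                        'monthly_salary': [188.0, 116.0]})

# 'not_a_column' does not exist; with errors='ignore' it is skipped
# instead of raising a KeyError, while 'office' is dropped as usual.
cleaned = demo_df.drop(columns=['office', 'not_a_column'], errors='ignore')

print(list(cleaned.columns))
```

This is handy when the same cleanup code runs on DataFrames that do not always share the same columns. The explicit if check remains the better choice when a missing column should trigger different logic.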

Delete a column if it matches a certain pattern

In this example we’ll use a list comprehension to loop through the column index and build a list of the column names that contain our pattern. We then remove those columns.

pattern = 'month'
drop_lst = [col for col in interviews_df.columns if pattern in col]
interviews_df_4 = interviews_df.drop(drop_lst, axis=1)
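
As an alternative to the list comprehension, the column index has vectorized string methods. The sketch below, on a hypothetical demo_df holding the same columns as our example, builds a boolean mask with str.contains and keeps only the columns that do not match. Passing regex=False treats the pattern as plain text, which is an assumption about what we want here.

```python
import pandas as pd

# Same columns as the tutorial's example, rebuilt so the snippet runs on its own.
demo_df = pd.DataFrame({
    'month': ['November', 'June', 'October'],
    'office': ['Toronto', 'Los Angeles', 'Bangkok'],
    'prog_language': ['Python', 'Javascript', 'Python'],
    'monthly_salary': [188.0, 172.0, 116.0],
})

pattern = 'month'

# True for every column name that contains the pattern as a plain substring.
matches = demo_df.columns.str.contains(pattern, regex=False)

# Keep all rows, and only the columns where the mask is False.
without_matches = demo_df.loc[:, ~matches]

print(list(without_matches.columns))
```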

Remove columns whose names start with a specific string

In a similar fashion, we can search for specific column names starting with a provided string and wipe them off our DataFrame.

pattern = 'month'
drop_lst = [col for col in interviews_df.columns if col.startswith(pattern)]
interviews_df_3 = interviews_df.drop(drop_lst, axis=1)
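
With our example data, the prefix search and the substring search happen to select the same columns. To show where they differ, the sketch below uses a hypothetical demo_df with an invented column, last_month_bonus, that contains the pattern but does not start with it.

```python
import pandas as pd

# Illustrative data; last_month_bonus is an invented column, not part of the tutorial's data.
demo_df = pd.DataFrame({
    'month': ['November', 'June'],
    'monthly_salary': [188.0, 172.0],
    'last_month_bonus': [10.0, 0.0],
    'office': ['Toronto', 'Los Angeles'],
})

pattern = 'month'

# Boolean mask: True only where the column name begins with the pattern.
starts_with = demo_df.columns.str.startswith(pattern)

# Drop the matching columns by selecting their names from the index.
remaining = demo_df.drop(columns=demo_df.columns[starts_with])

print(list(remaining.columns))
```

Here last_month_bonus survives because the match is anchored to the beginning of the name.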
