Drop pandas DataFrame columns not working – how to fix?

In this short tutorial we’ll learn how to troubleshoot issues related one of the most basic tasks when wrangling pandas DataFrames: removing columns off your DataFrame.

Can’t drop pandas columns

Let’s assume that you are working with the following DataFrame:

import pandas as pd

month = ['November', 'December', 'October']
language = ['R', 'Python', 'R']
salary = [ 134.0, 142.0, 165.0]
data = dict(month = month, office = office, salary = salary)
my_df = pd.DataFrame(data=data)
print (my_df.head())

This will return the following data set:

monthofficesalary
0NovemberOsaka134.0
1DecemberLos Angeles142.0
2OctoberNew York165.0

Example #1 – specifying the right axis

Let’s assume that we would like to remove the office column. We will use the drop() DataFrame method:

my_df.drop('office')

This will return a KeyError exception:

KeyError: "['office'] not found in axis"

A DataFrame has two axes: rows (index) and columns. When we invoke the drop() method, pandas searches by default for the specific index to delete in the rows. If we would like to delete a column we simply add the axis=1 parameter:

my_df.drop('office', axis = 1)

Our Jupyter Notebook (or other Python development environment) will return the following output:

monthsalary
0November134.0
1December142.0
2October165.0

Example #2 – Keeping changes with inplace=True

By default, when dropping rows or columns, the changes to the original DataFrame are not kept. If we would like to keep changes, we can either save the DataFrame as a new object or use the inplace=True parameter.

my_new_df = my_df.drop('office', axis = 1)

or alternatively:

 my_df.drop('office', axis = 1, inplace=True)

Example # 3 – drop column not working when importing tsv files

Most often, we acquire our data from external files such as text , csv, Excelor json or databases such as MySQL and sqlite.

I have noticed that errors related to dropping columns are prevalent when working with tab separated values (tsv files) .This is puzzling at times as your file includes the column name but you get a KeyError when trying to get rid of it!

You can solve this by specifying the right separator (using the sep parameter) when invoking the read_csv() method. Here is a simple example:

my_df = pd.read_csv('salaries.tsv', sep='\t', usecols = cols)

Example # 4 – failing to drop multiple columns

Another case is when you try to remove multiple columns. Example:

 my_df.drop('office', 'salary', axis = 1, inplace=True)

This will raise the following Type Error:

TypeError: drop() got multiple values for argument 'axis'

Fixing this is easy – just pass a list of columns to drop:

my_df.drop(['office', 'salary'], axis = 1, inplace=True)

Related learning

Can’t change the label name of pandas columns – how to solve?