Solving KeyError exceptions in pandas
The most likely reason for a KeyError exception when working with a pandas DataFrame is a typo in a column or row label. When in doubt, check the actual labels with the following commands:
print(your_df.columns) # column labels
print(your_df.index) # row labels
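The same check can be done in code before accessing a label. The snippet below is a minimal sketch; the DataFrame df and the misspelled label are made up for illustration only.

```python
import pandas as pd

# Hypothetical DataFrame used only for this illustration.
df = pd.DataFrame({"month": ["November", "March"], "salary": [155.0, 137.0]})

label = "salaries"  # deliberately misspelled

# The `in` operator on df.columns tests whether a column label exists.
if label in df.columns:
    print(df[label])
else:
    print(f"{label!r} is not a column; available columns: {list(df.columns)}")
```

The same test works for row labels with df.index in place of df.columns.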
Define an Example DataFrame
Let’s start by creating a very simple DataFrame that you can use to follow along with this tutorial. Feel free to use the following snippet in your Jupyter notebook or Python script:
import pandas as pd
month = ['November', 'March', 'December']
language = ['Javascript', 'R', 'Java']
office = ['New York', 'New York', 'Los Angeles']
salary = [155.0, 137.0, 189.0]
hiring = dict(month=month, language=language, salary=salary)
hrdf = pd.DataFrame(data=hiring)
KeyError: not found in axis exception
Let’s assume that we would like to drop one or more columns from our DataFrame. We’ll purposely make a spelling mistake in the column name: instead of salary, we’ll write salaries.
hrdf.drop(columns='salaries')
Pandas will throw the following exception:
KeyError: "['salaries'] not found in axis"
The reason is simple: we have a typo in the column name. If in doubt about your column labels, simply inspect the columns attribute (it is a property, not a method, so it takes no parentheses):
print(hrdf.columns)
This will return:
Index(['month', 'language', 'salary'], dtype='object')
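All we need now is to fix the column name. Note that drop() searches the row labels by default, so we also tell pandas that the label refers to a column:
hrdf.drop(columns='salary')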
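The behavior of drop() with and without an explicit axis can be checked with the short sketch below. It rebuilds the example DataFrame so that it runs on its own; the variable names without_salary and unchanged are introduced here for illustration only.

```python
import pandas as pd

hrdf = pd.DataFrame({
    "month": ["November", "March", "December"],
    "language": ["Javascript", "R", "Java"],
    "salary": [155.0, 137.0, 189.0],
})

# With the default axis=0, pandas looks for 'salary' among the row labels
# (0, 1, 2), so even a correctly spelled column name raises a KeyError.
try:
    hrdf.drop("salary")
except KeyError as err:
    print("KeyError:", err)

# Naming the axis explicitly drops the column and returns a new DataFrame.
without_salary = hrdf.drop(columns="salary")
print(list(without_salary.columns))

# errors='ignore' silently skips labels that do not exist instead of raising.
unchanged = hrdf.drop(columns="salaries", errors="ignore")
print(list(unchanged.columns))
```

Keep in mind that errors='ignore' also hides genuine typos, so it is probably best reserved for cases where a missing column is acceptable.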
KeyError: not in index exception
Another very similar error happens when we try to subset columns or rows from our DataFrame, and accidentally have a typo in one or more of our row or column label names. In the following example we would like to select a couple of columns from our DataFrame:
subset = hrdf[['language', 'salaries']]
This raises a KeyError, because salaries is not one of the column labels. Fixing the typo will do the trick:
subset = hrdf[['language', 'salary']]
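When the list of requested labels is long, spotting the typo by eye gets harder. One possible approach, sketched below, is to compute which labels are missing and use difflib from the Python standard library to suggest the closest existing column name. The names requested, missing and suggestions are assumptions made for this example and are not part of pandas.

```python
import difflib

import pandas as pd

hrdf = pd.DataFrame({
    "month": ["November", "March", "December"],
    "language": ["Javascript", "R", "Java"],
    "salary": [155.0, 137.0, 189.0],
})

requested = ["language", "salaries"]

# Labels that were requested but do not exist as columns.
missing = [name for name in requested if name not in hrdf.columns]

# For each missing label, look up the most similar existing column name.
suggestions = {
    name: difflib.get_close_matches(name, list(hrdf.columns), n=1)
    for name in missing
}

for name, matches in suggestions.items():
    print(f"{name!r} not found; closest existing columns: {matches}")
```

The suggestion is only a hint based on string similarity, so it is worth confirming the label against hrdf.columns before using it.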