In this short tutorial, we’ll learn how to delete one or more DataFrame columns using their position in the column index.
Example data
Let’s start by defining a simple example DataFrame:
import pandas as pd
hr_dict = {'city': ['Atlanta', 'Boston', 'Charlotte', 'New York'],
           'branch': [1, 2, 1, 3],
           'num_candidates': [100, 85, 84, 78],
           'num_interviews': [75, 60, 45, 65]}
hiring = pd.DataFrame(hr_dict)
Let’s look at the column index. Type the following code:
hiring.columns
You’ll get back the column index, which we’ll use to write code for the use cases we’ll cover.
Index(['city', 'branch', 'num_candidates', 'num_interviews'], dtype='object')
Drop one column by index in Pandas
In this and the following cases, we will use the DataFrame.drop() method. We’ll start by removing one column from the DataFrame by its position in the column index (instead of typing its name / label).
cols = hiring.columns[0]
hiring.drop(columns=cols)
This will drop the first column (in our case, city).
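To double-check that claim, here is a minimal, self-contained sketch that rebuilds the example DataFrame and drops the column at position 0. The names first_col and trimmed are only illustrative and are not part of the tutorial's code.

```python
import pandas as pd

hr_dict = {'city': ['Atlanta', 'Boston', 'Charlotte', 'New York'],
           'branch': [1, 2, 1, 3],
           'num_candidates': [100, 85, 84, 78],
           'num_interviews': [75, 60, 45, 65]}
hiring = pd.DataFrame(hr_dict)

# Look up the label stored at position 0 of the column index
first_col = hiring.columns[0]

# drop() returns a new DataFrame; hiring itself is not modified here
trimmed = hiring.drop(columns=first_col)

print(first_col)
print(list(trimmed.columns))
print(list(hiring.columns))
```

Note that the position is only used to look up the label; drop() itself still removes the column by label.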
Persisting your changes
Note that simply invoking the DataFrame.drop() method won’t change your original DataFrame; it returns a modified copy. If you want to persist the change, you should either assign the returned DataFrame to a variable or pass the inplace=True parameter. Here’s a quick example:
cols = hiring.columns[0]
# Persisting changes - method 1: assign the result to a new DataFrame
hiring1 = hiring.drop(columns=cols)
hiring1.head()
# Persisting changes - method 2: modify the original DataFrame in place
# (run this last: once city is gone, dropping it again raises a KeyError)
hiring.drop(columns=cols, inplace=True)
hiring.head()
Both approaches produce the same result: a DataFrame without the first column.
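As a sketch of the two approaches side by side, the snippet below applies each one to a fresh copy of the example data and confirms that they end up with the same content. The names in_place, original, reassigned and result are illustrative.

```python
import pandas as pd

hr_dict = {'city': ['Atlanta', 'Boston', 'Charlotte', 'New York'],
           'branch': [1, 2, 1, 3],
           'num_candidates': [100, 85, 84, 78],
           'num_interviews': [75, 60, 45, 65]}
cols = pd.DataFrame(hr_dict).columns[0]

# Approach A: modify the DataFrame in place; drop() then returns None
in_place = pd.DataFrame(hr_dict)
result = in_place.drop(columns=cols, inplace=True)

# Approach B: leave the original alone and keep the returned copy
original = pd.DataFrame(hr_dict)
reassigned = original.drop(columns=cols)

print(result is None)
print(in_place.equals(reassigned))
print(list(original.columns))
```

Because drop() returns None when inplace=True is set, its return value should not be assigned back to the DataFrame variable in that case.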
Delete multiple columns by index
In this case, we would like to remove several columns. We’ll slice the column index appropriately and get rid of the second and third columns (indices 1 and 2).
# drop multiple columns by index
cols = hiring.columns
hiring.drop(columns=cols[1:3], inplace=True)
The resulting DataFrame no longer contains those two columns.
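On a freshly created copy of the example data, the slice can be checked as follows. The sketch also adds a variation that is not covered above: passing a list of positions to the column index, which may help when the columns to remove are not adjacent. The names adjacent and scattered are illustrative.

```python
import pandas as pd

hr_dict = {'city': ['Atlanta', 'Boston', 'Charlotte', 'New York'],
           'branch': [1, 2, 1, 3],
           'num_candidates': [100, 85, 84, 78],
           'num_interviews': [75, 60, 45, 65]}
hiring = pd.DataFrame(hr_dict)

# Adjacent columns: the slice 1:3 selects positions 1 and 2 (3 is excluded)
adjacent = hiring.drop(columns=hiring.columns[1:3])

# Non-adjacent columns: index the column Index with a list of positions
scattered = hiring.drop(columns=hiring.columns[[0, 2]])

print(list(adjacent.columns))
print(list(scattered.columns))
```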
Delete last column by index
hiring.drop(columns=hiring.columns[-1], inplace=True)
Drop all DataFrame columns
For completeness, here is the code to remove all the columns of a DataFrame.
# drop all columns
hiring.drop(columns=hiring.columns, inplace=True)
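It may be worth seeing what is left after this call. In the minimal sketch below, run on a fresh copy of the example data, the columns are gone but the row index is kept. The name emptied is illustrative.

```python
import pandas as pd

hr_dict = {'city': ['Atlanta', 'Boston', 'Charlotte', 'New York'],
           'branch': [1, 2, 1, 3],
           'num_candidates': [100, 85, 84, 78],
           'num_interviews': [75, 60, 45, 65]}
hiring = pd.DataFrame(hr_dict)

# Drop every column; the row index of the DataFrame is preserved
emptied = hiring.drop(columns=hiring.columns)

print(emptied.shape)        # (rows, columns)
print(emptied.empty)        # True when any axis has length 0
print(list(emptied.index))  # the original row labels
```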
Drop rows by index
You can remove rows as well as columns. In this case, we’ll pass a list of the row index labels to remove to the index parameter.
# drop the third and fourth rows by index
idx = [2,3]
hiring.drop(index=idx)
The returned DataFrame keeps only the first two rows (index labels 0 and 1).
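With the default integer index, row labels and row positions happen to coincide. The sketch below first checks the call on a fresh copy of the example data, and then shows a hypothetical variation in which city is used as the row index, so positions have to be translated into labels through DataFrame.index before dropping. The names by_label, by_city and by_position are illustrative.

```python
import pandas as pd

hr_dict = {'city': ['Atlanta', 'Boston', 'Charlotte', 'New York'],
           'branch': [1, 2, 1, 3],
           'num_candidates': [100, 85, 84, 78],
           'num_interviews': [75, 60, 45, 65]}
hiring = pd.DataFrame(hr_dict)

# Default RangeIndex: labels 2 and 3 are also positions 2 and 3
by_label = hiring.drop(index=[2, 3])

# Hypothetical variation: use city as the row index, so labels are strings
by_city = hiring.set_index('city')

# Translate positions 2 and 3 into their labels, then drop those rows
by_position = by_city.drop(index=by_city.index[[2, 3]])

print(list(by_label.index))
print(list(by_position.index))
```

This mirrors the column case: the position is used to look up labels, and drop() then removes the rows by label.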